--------------------------------------------------------------------------------
       log:  D:\wf\work\wf-all.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:49
.
. // program: wf-all.do \ for stata 9
. // task: run wf examples
. // project: workflow book
. // author: scott long \ 2008-10-24
.
. do wf3.do
. capture log close master
. log using wf3, name(master) replace text
(note: file D:\wf\work\wf3.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:49
.
. // program: wf3.do \ for stata 9
. // task: run all do-files in the order they appear
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. * introductory example
. do wf3-intro.do
. // !! Start of "extra" commands !!
.
. // NOTE: The following comments and commands would normally be put after
. // the log using command. To correspond to the simple example in the
. // text, they are not included in the log file for this example.
.
. capture log close
.
. // program: wf3-intro.do \ for stata 9
. // task: a simple do file
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
. // setup - these setup commands are not shown in text
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data and check descriptive statistics
.
. // !! End of "extra commands" !!
.
. log using wf3-intro, replace text
(note: file D:\wf\work\wf3-intro.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-intro.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:49
. use wf-lfp, clear
(Workflow data on labor force participation \ 2008-04-02)
. summarize lfp age
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+--------------------------------------------------------
         lfp |       753    .5683931    .4956295          0          1
         age |       753    42.53785    8.072574         30         60
. log close
       log:  D:\wf\work\wf3-intro.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:49
--------------------------------------------------------------------------------
. exit
end of do-file
.
. * robust do-files
. do wf3-step1.do
. // !! Start of "extra" commands !!
.
. // NOTE: The following comments and commands would normally be put after
. // the log using command. To correspond to the simple example in the
. // text, they are not included in the log file for this example.
.
. capture log close
.
. // program: wf3-step1.do \ for stata 9
. // task: Step 1 of job sequence to illustrate why files
. // need to be self-contained.
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // create variables for having children of different ages
.
. // !! End of "extra commands" !!
.
. log using wf3-step1, replace text
(note: file D:\wf\work\wf3-step1.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-step1.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:49
. use wf-lfp, clear
(Workflow data on labor force participation \ 2008-04-02)
. generate hask5 = (k5>0) & (k5<.)
. label var hask5 "Has children less than 5 yrs old?"
. generate hask618 = (k618>0) & (k618<.)
. label var hask618 "Has children between 6 and 18 yrs old?"
.
. log close
       log:  D:\wf\work\wf3-step1.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:49
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf3-step2.do
. // !! Start of "extra" commands !!
.
. // NOTE: The following comments and commands would normally be put after
. // the log using command. To correspond to the simple example in the
. // text, they are not included in the log file for this example.
.
. capture log close
.
. // program: wf3-step2.do \ for stata 9
. // task: Step 2 of job sequence to illustrate why files
. // need to be self-contained.
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
.
. // #1
. // estimate logit model based on variable created in prior step
.
. // !! End of "extra commands" !!
.
. log using wf3-step2, replace text
(note: file D:\wf\work\wf3-step2.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-step2.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:49
. logit lfp hask5 hask618 age wc hc lwg inc, nolog
Logistic regression                               Number of obs   =        753
                                                  LR chi2(7)      =     117.11
                                                  Prob > chi2     =     0.0000
Log likelihood = -456.31607                       Pseudo R2       =     0.1137
------------------------------------------------------------------------------
         lfp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       hask5 |  -1.753548   .2403085    -7.30   0.000    -2.224544   -1.282552
     hask618 |   -.187838   .1903462    -0.99   0.324    -.5609098    .1852337
         age |  -.0624687   .0130281    -4.79   0.000    -.0880033   -.0369342
          wc |   .7027591   .2260325     3.11   0.002     .2597435    1.145775
          hc |    .132653   .2050685     0.65   0.518    -.2692739      .53458
         lwg |   .6037747   .1502751     4.02   0.000      .309241    .8983085
         inc |  -.0330497   .0081265    -4.07   0.000    -.0489774    -.017122
       _cons |   3.192306   .6669531     4.79   0.000     1.885102     4.49951
------------------------------------------------------------------------------
. log close
       log:  D:\wf\work\wf3-step2.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:49
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf3-step1-v2.do
. // !! Start of "extra" commands !!
.
. // NOTE: The following comments and commands would normally be put after
. // the log using command. To correspond to the simple example in the
. // text, they are not included in the log file for this example.
.
. capture log close
.
. // program: wf3-step1-v2.do \ for stata 9
. // task: Step 1 of job sequence to illustrate why files
. // need to be self-contained. This time the program
. // saves the new variables.
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // create variables for having children of different ages
.
. // !! End of "extra commands" !!
.
. log using wf3-step1-v2, replace text
(note: file D:\wf\work\wf3-step1-v2.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-step1-v2.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:49
. use wf-lfp, clear
(Workflow data on labor force participation \ 2008-04-02)
. generate hask5 = (k5>0) & (k5<.)
. label var hask5 "Has children less than 5 yrs old?"
. generate hask618 = (k618>0) & (k618<.)
. label var hask618 "Has children between 6 and 18 yrs old?"
. save wf-lfp-v2, replace
file wf-lfp-v2.dta saved
.
. log close
       log:  D:\wf\work\wf3-step1-v2.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:49
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf3-step2-v2.do
. // !! Start of "extra" commands !!
.
. // NOTE: The following comments and commands would normally be put after
. // the log using command. To correspond to the simple example in the
. // text, they are not included in the log file for this example.
.
. capture log close
.
. // program: wf3-step2-v2.do \ for stata 9
. // task: Step 2 of job sequence to illustrate why files
. // need to be self-contained. The version loads the
. // dataset created in step 1.
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
.
. // #1
. // load data and estimate logit model
.
. // !! End of "extra commands" !!
.
. log using wf3-step2-v2, replace text
(note: file D:\wf\work\wf3-step2-v2.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-step2-v2.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:49
. use wf-lfp-v2, clear
(Workflow data on labor force participation \ 2008-04-02)
. logit lfp hask5 hask618 age wc hc lwg inc, nolog
Logistic regression                               Number of obs   =        753
                                                  LR chi2(7)      =     117.11
                                                  Prob > chi2     =     0.0000
Log likelihood = -456.31607                       Pseudo R2       =     0.1137
------------------------------------------------------------------------------
         lfp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       hask5 |  -1.753548   .2403085    -7.30   0.000    -2.224544   -1.282552
     hask618 |   -.187838   .1903462    -0.99   0.324    -.5609098    .1852337
         age |  -.0624687   .0130281    -4.79   0.000    -.0880033   -.0369342
          wc |   .7027591   .2260325     3.11   0.002     .2597435    1.145775
          hc |    .132653   .2050685     0.65   0.518    -.2692739      .53458
         lwg |   .6037747   .1502751     4.02   0.000      .309241    .8983085
         inc |  -.0330497   .0081265    -4.07   0.000    -.0489774    -.017122
       _cons |   3.192306   .6669531     4.79   0.000     1.885102     4.49951
------------------------------------------------------------------------------
. log close
       log:  D:\wf\work\wf3-step2-v2.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:49
--------------------------------------------------------------------------------
.
. exit
end of do-file
.
. * legible do-files
. do wf3-longcommand.do
. capture log close
. log using wf3-longcommand, replace text
(note: file D:\wf\work\wf3-longcommand.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-longcommand.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:49
.
. // program: wf3-longcommand.do \ for stata 9
. // task: example of problems with long command lines
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
.
// setup - these commands are explained later in the book . . version 9.2 . set linesize 200 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data . . use wf-longcommand, clear (Workflow data for long command example \ 2008-04-02) . . // #2 . // estimate model . . mlogit jobchoice income origin prestigepar aptitude siblings friends scale1_std demands interestlvl jobgoal scale3 scale2_std motivation parented city female, noconstant baseoutcome(1) Iteration 0: log likelihood = -181.27103 Iteration 1: log likelihood = -127.22576 Iteration 2: log likelihood = -123.51868 Iteration 3: log likelihood = -123.27821 Iteration 4: log likelihood = -123.27561 Iteration 5: log likelihood = -123.27561 Multinomial logistic regression Number of obs = 165 LR chi2(32) = 115.99 Prob > chi2 = 0.0000 Log likelihood = -123.27561 Pseudo R2 = 0.3199 ------------------------------------------------------------------------------ jobchoice | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- 2 | income | .6843505 .2755696 2.48 0.013 .1442439 1.224457 origin | -.6908639 .6291814 -1.10 0.272 -1.924037 .542309 prestigepar | -.7701228 .3097407 -2.49 0.013 -1.377203 -.1630422 aptitude | -.0861294 .27721 -0.31 0.756 -.629451 .4571922 siblings | -.5251506 .3478754 -1.51 0.131 -1.206974 .1566727 friends | -.4039155 .326036 -1.24 0.215 -1.042934 .2351034 scale1_std | .6897483 .2705481 2.55 0.011 .1594838 1.220013 demands | -.1528215 .2692571 -0.57 0.570 -.6805557 .3749128 interestlvl | .1212952 .2247168 0.54 0.589 -.3191415 .561732 jobgoal | .1456197 .2790999 0.52 0.602 -.401406 .6926454 scale3 | .2180359 .3452164 0.63 0.528 -.4585758 .8946476 scale2_std | -.3453626 .2802993 -1.23 0.218 -.8947392 .204014 motivation | .056955 .3188401 0.18 0.858 -.5679601 .6818701 parented | .4617825 .8054585 0.57 0.566 -1.116887 2.040452 city | -.2243885 .9614899 -0.23 0.815 -2.108874 1.660097 female | 1.25085 
.711611 1.76 0.079 -.1438821 2.645582 -------------+---------------------------------------------------------------- 3 | income | .188663 .2327753 0.81 0.418 -.2675682 .6448942 origin | -.8334307 .5002381 -1.67 0.096 -1.813879 .147018 prestigepar | .013533 .213663 0.06 0.949 -.4052387 .4323048 aptitude | -.2641383 .2204725 -1.20 0.231 -.6962565 .1679799 siblings | -.2969743 .27851 -1.07 0.286 -.8428439 .2488953 friends | -.4341386 .2712158 -1.60 0.109 -.9657118 .0974346 scale1_std | .3581498 .208518 1.72 0.086 -.0505378 .7668375 demands | .0407877 .2223813 0.18 0.854 -.3950716 .4766469 interestlvl | -.3956904 .2053118 -1.93 0.054 -.7980942 .0067134 jobgoal | .1579681 .2250737 0.70 0.483 -.2831682 .5991044 scale3 | .5838574 .2765891 2.11 0.035 .0417528 1.125962 scale2_std | -.0409103 .2269731 -0.18 0.857 -.4857694 .4039488 motivation | .1090746 .2659794 0.41 0.682 -.4122354 .6303847 parented | -.4226362 .5773327 -0.73 0.464 -1.554187 .708915 city | .4113895 .7146071 0.58 0.565 -.9892146 1.811994 female | 2.21348 .5849222 3.78 0.000 1.067054 3.359906 ------------------------------------------------------------------------------ (jobchoice==1 is the base outcome) . 
listcoef mlogit (N=165): Factor Change in the Odds of jobchoice Variable: income (sd=1.1324678) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | 0.49569 0.825 0.409 1.6416 1.7530 2 -1 | 0.68435 2.483 0.013 1.9825 2.1706 3 -2 | -0.49569 -0.825 0.409 0.6092 0.5704 3 -1 | 0.18866 0.377 0.706 1.2076 1.2382 1 -2 | -0.68435 -2.483 0.013 0.5044 0.4607 1 -3 | -0.18866 -0.377 0.706 0.8281 0.8076 ---------------------------------------------------------------- Variable: origin (sd=.48875001) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | 0.14257 0.208 0.835 1.1532 1.0722 2 -1 | -0.69086 -1.098 0.272 0.5011 0.7134 3 -2 | -0.14257 -0.208 0.835 0.8671 0.9327 3 -1 | -0.83343 -3.901 0.000 0.4346 0.6654 1 -2 | 0.69086 1.098 0.272 1.9954 1.4017 1 -3 | 0.83343 3.901 0.000 2.3012 1.5028 ---------------------------------------------------------------- Variable: prestigepar (sd=1.0910846) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | -0.78366 -1.933 0.053 0.4567 0.4253 2 -1 | -0.77012 -2.486 0.013 0.4630 0.4316 3 -2 | 0.78366 1.933 0.053 2.1895 2.3515 3 -1 | 0.01353 0.061 0.951 1.0136 1.0149 1 -2 | 0.77012 2.486 0.013 2.1600 2.3170 1 -3 | -0.01353 -0.061 0.951 0.9866 0.9853 ---------------------------------------------------------------- Variable: aptitude (sd=1.1823099) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | 0.17801 0.451 0.652 1.1948 1.2342 2 -1 | -0.08613 -0.311 0.756 0.9175 0.9032 3 -2 | -0.17801 -0.451 0.652 0.8369 0.8102 3 -1 | -0.26414 -0.948 0.343 0.7679 0.7318 1 -2 | 0.08613 0.311 0.756 1.0899 1.1072 1 -3 | 0.26414 0.948 0.343 1.3023 1.3666 
---------------------------------------------------------------- Variable: siblings (sd=1.129069) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | -0.22818 -0.487 0.626 0.7960 0.7729 2 -1 | -0.52515 -1.510 0.131 0.5915 0.5527 3 -2 | 0.22818 0.487 0.626 1.2563 1.2939 3 -1 | -0.29697 -1.095 0.274 0.7431 0.7151 1 -2 | 0.52515 1.510 0.131 1.6907 1.8093 1 -3 | 0.29697 1.095 0.274 1.3458 1.3984 ---------------------------------------------------------------- Variable: friends (sd=1.1223392) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | 0.03022 0.075 0.940 1.0307 1.0345 2 -1 | -0.40392 -1.239 0.215 0.6677 0.6355 3 -2 | -0.03022 -0.075 0.940 0.9702 0.9666 3 -1 | -0.43414 -2.082 0.037 0.6478 0.6143 1 -2 | 0.40392 1.239 0.215 1.4977 1.5735 1 -3 | 0.43414 2.082 0.037 1.5436 1.6278 ---------------------------------------------------------------- Variable: scale1_std (sd=1.3065869) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | 0.33160 0.917 0.359 1.3932 1.5423 2 -1 | 0.68975 2.549 0.011 1.9932 2.4626 3 -2 | -0.33160 -0.917 0.359 0.7178 0.6484 3 -1 | 0.35815 1.611 0.107 1.4307 1.5967 1 -2 | -0.68975 -2.549 0.011 0.5017 0.4061 1 -3 | -0.35815 -1.611 0.107 0.6990 0.6263 ---------------------------------------------------------------- Variable: demands (sd=1.3001791) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | -0.19361 -0.540 0.589 0.8240 0.7775 2 -1 | -0.15282 -0.568 0.570 0.8583 0.8198 3 -2 | 0.19361 0.540 0.589 1.2136 1.2862 3 -1 | 0.04079 0.199 0.843 1.0416 1.0545 1 -2 | 0.15282 0.568 0.570 1.1651 1.2198 1 -3 | -0.04079 -0.199 0.843 0.9600 0.9484 
---------------------------------------------------------------- Variable: interestlvl (sd=1.3375303) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | 0.51699 1.633 0.103 1.6770 1.9967 2 -1 | 0.12130 0.540 0.589 1.1290 1.1761 3 -2 | -0.51699 -1.633 0.103 0.5963 0.5008 3 -1 | -0.39569 -1.758 0.079 0.6732 0.5890 1 -2 | -0.12130 -0.540 0.589 0.8858 0.8502 1 -3 | 0.39569 1.758 0.079 1.4854 1.6977 ---------------------------------------------------------------- Variable: jobgoal (sd=1.1480244) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | -0.01235 -0.031 0.975 0.9877 0.9859 2 -1 | 0.14562 0.522 0.602 1.1568 1.1820 3 -2 | 0.01235 0.031 0.975 1.0124 1.0143 3 -1 | 0.15797 0.571 0.568 1.1711 1.1988 1 -2 | -0.14562 -0.522 0.602 0.8645 0.8461 1 -3 | -0.15797 -0.571 0.568 0.8539 0.8341 ---------------------------------------------------------------- Variable: scale3 (sd=1.3960138) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | -0.36582 -0.881 0.378 0.6936 0.6001 2 -1 | 0.21804 0.632 0.528 1.2436 1.3558 3 -2 | 0.36582 0.881 0.378 1.4417 1.6664 3 -1 | 0.58386 2.572 0.010 1.7929 2.2593 1 -2 | -0.21804 -0.632 0.528 0.8041 0.7376 1 -3 | -0.58386 -2.572 0.010 0.5577 0.4426 ---------------------------------------------------------------- Variable: scale2_std (sd=1.2168745) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | -0.30445 -0.761 0.447 0.7375 0.6904 2 -1 | -0.34536 -1.232 0.218 0.7080 0.6569 3 -2 | 0.30445 0.761 0.447 1.3559 1.4484 3 -1 | -0.04091 -0.154 0.878 0.9599 0.9514 1 -2 | 0.34536 1.232 0.218 1.4125 1.5224 1 -3 | 0.04091 0.154 0.878 1.0418 1.0510 
---------------------------------------------------------------- Variable: motivation (sd=1.3483175) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | -0.05212 -0.078 0.938 0.9492 0.9321 2 -1 | 0.05695 0.179 0.858 1.0586 1.0798 3 -2 | 0.05212 0.078 0.938 1.0535 1.0728 3 -1 | 0.10907 0.189 0.850 1.1152 1.1584 1 -2 | -0.05695 -0.179 0.858 0.9446 0.9261 1 -3 | -0.10907 -0.189 0.850 0.8967 0.8632 ---------------------------------------------------------------- Variable: parented (sd=.48958102) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | 0.88442 0.724 0.469 2.4216 1.5419 2 -1 | 0.46178 0.573 0.566 1.5869 1.2537 3 -2 | -0.88442 -0.724 0.469 0.4130 0.6486 3 -1 | -0.42264 -0.591 0.554 0.6553 0.8131 1 -2 | -0.46178 -0.573 0.566 0.6302 0.7977 1 -3 | 0.42264 0.591 0.554 1.5260 1.2299 ---------------------------------------------------------------- Variable: city (sd=.37650904) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | -0.63578 -0.585 0.558 0.5295 0.7871 2 -1 | -0.22439 -0.233 0.815 0.7990 0.9190 3 -2 | 0.63578 0.585 0.558 1.8885 1.2705 3 -1 | 0.41139 0.703 0.482 1.5089 1.1675 1 -2 | 0.22439 0.233 0.815 1.2516 1.0882 1 -3 | -0.41139 -0.703 0.482 0.6627 0.8565 ---------------------------------------------------------------- Variable: female (sd=.50129175) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -1 | 1.25085 1.758 0.079 3.4933 1.8721 1 -2 | -1.25085 -1.758 0.079 0.2863 0.5342 ---------------------------------------------------------------- . . // #2 . // estimate model with a reformatted command . . 
mlogit jobchoice income origin prestigepar aptitude siblings friends /// > scale1_std demands interestlvl jobgoal scale3 scale2_std motivation /// > parented city female, noconstant baseoutcome(1) Iteration 0: log likelihood = -181.27103 Iteration 1: log likelihood = -127.22576 Iteration 2: log likelihood = -123.51868 Iteration 3: log likelihood = -123.27821 Iteration 4: log likelihood = -123.27561 Iteration 5: log likelihood = -123.27561 Multinomial logistic regression Number of obs = 165 LR chi2(32) = 115.99 Prob > chi2 = 0.0000 Log likelihood = -123.27561 Pseudo R2 = 0.3199 ------------------------------------------------------------------------------ jobchoice | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- 2 | income | .6843505 .2755696 2.48 0.013 .1442439 1.224457 origin | -.6908639 .6291814 -1.10 0.272 -1.924037 .542309 prestigepar | -.7701228 .3097407 -2.49 0.013 -1.377203 -.1630422 aptitude | -.0861294 .27721 -0.31 0.756 -.629451 .4571922 siblings | -.5251506 .3478754 -1.51 0.131 -1.206974 .1566727 friends | -.4039155 .326036 -1.24 0.215 -1.042934 .2351034 scale1_std | .6897483 .2705481 2.55 0.011 .1594838 1.220013 demands | -.1528215 .2692571 -0.57 0.570 -.6805557 .3749128 interestlvl | .1212952 .2247168 0.54 0.589 -.3191415 .561732 jobgoal | .1456197 .2790999 0.52 0.602 -.401406 .6926454 scale3 | .2180359 .3452164 0.63 0.528 -.4585758 .8946476 scale2_std | -.3453626 .2802993 -1.23 0.218 -.8947392 .204014 motivation | .056955 .3188401 0.18 0.858 -.5679601 .6818701 parented | .4617825 .8054585 0.57 0.566 -1.116887 2.040452 city | -.2243885 .9614899 -0.23 0.815 -2.108874 1.660097 female | 1.25085 .711611 1.76 0.079 -.1438821 2.645582 -------------+---------------------------------------------------------------- 3 | income | .188663 .2327753 0.81 0.418 -.2675682 .6448942 origin | -.8334307 .5002381 -1.67 0.096 -1.813879 .147018 prestigepar | .013533 .213663 0.06 0.949 -.4052387 
.4323048 aptitude | -.2641383 .2204725 -1.20 0.231 -.6962565 .1679799 siblings | -.2969743 .27851 -1.07 0.286 -.8428439 .2488953 friends | -.4341386 .2712158 -1.60 0.109 -.9657118 .0974346 scale1_std | .3581498 .208518 1.72 0.086 -.0505378 .7668375 demands | .0407877 .2223813 0.18 0.854 -.3950716 .4766469 interestlvl | -.3956904 .2053118 -1.93 0.054 -.7980942 .0067134 jobgoal | .1579681 .2250737 0.70 0.483 -.2831682 .5991044 scale3 | .5838574 .2765891 2.11 0.035 .0417528 1.125962 scale2_std | -.0409103 .2269731 -0.18 0.857 -.4857694 .4039488 motivation | .1090746 .2659794 0.41 0.682 -.4122354 .6303847 parented | -.4226362 .5773327 -0.73 0.464 -1.554187 .708915 city | .4113895 .7146071 0.58 0.565 -.9892146 1.811994 female | 2.21348 .5849222 3.78 0.000 1.067054 3.359906 ------------------------------------------------------------------------------ (jobchoice==1 is the base outcome) . listcoef mlogit (N=165): Factor Change in the Odds of jobchoice Variable: income (sd=1.1324678) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | 0.49569 0.825 0.409 1.6416 1.7530 2 -1 | 0.68435 2.483 0.013 1.9825 2.1706 3 -2 | -0.49569 -0.825 0.409 0.6092 0.5704 3 -1 | 0.18866 0.377 0.706 1.2076 1.2382 1 -2 | -0.68435 -2.483 0.013 0.5044 0.4607 1 -3 | -0.18866 -0.377 0.706 0.8281 0.8076 ---------------------------------------------------------------- Variable: origin (sd=.48875001) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | 0.14257 0.208 0.835 1.1532 1.0722 2 -1 | -0.69086 -1.098 0.272 0.5011 0.7134 3 -2 | -0.14257 -0.208 0.835 0.8671 0.9327 3 -1 | -0.83343 -3.901 0.000 0.4346 0.6654 1 -2 | 0.69086 1.098 0.272 1.9954 1.4017 1 -3 | 0.83343 3.901 0.000 2.3012 1.5028 ---------------------------------------------------------------- Variable: prestigepar (sd=1.0910846) Odds comparing | 
Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | -0.78366 -1.933 0.053 0.4567 0.4253 2 -1 | -0.77012 -2.486 0.013 0.4630 0.4316 3 -2 | 0.78366 1.933 0.053 2.1895 2.3515 3 -1 | 0.01353 0.061 0.951 1.0136 1.0149 1 -2 | 0.77012 2.486 0.013 2.1600 2.3170 1 -3 | -0.01353 -0.061 0.951 0.9866 0.9853 ---------------------------------------------------------------- Variable: aptitude (sd=1.1823099) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | 0.17801 0.451 0.652 1.1948 1.2342 2 -1 | -0.08613 -0.311 0.756 0.9175 0.9032 3 -2 | -0.17801 -0.451 0.652 0.8369 0.8102 3 -1 | -0.26414 -0.948 0.343 0.7679 0.7318 1 -2 | 0.08613 0.311 0.756 1.0899 1.1072 1 -3 | 0.26414 0.948 0.343 1.3023 1.3666 ---------------------------------------------------------------- Variable: siblings (sd=1.129069) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | -0.22818 -0.487 0.626 0.7960 0.7729 2 -1 | -0.52515 -1.510 0.131 0.5915 0.5527 3 -2 | 0.22818 0.487 0.626 1.2563 1.2939 3 -1 | -0.29697 -1.095 0.274 0.7431 0.7151 1 -2 | 0.52515 1.510 0.131 1.6907 1.8093 1 -3 | 0.29697 1.095 0.274 1.3458 1.3984 ---------------------------------------------------------------- Variable: friends (sd=1.1223392) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | 0.03022 0.075 0.940 1.0307 1.0345 2 -1 | -0.40392 -1.239 0.215 0.6677 0.6355 3 -2 | -0.03022 -0.075 0.940 0.9702 0.9666 3 -1 | -0.43414 -2.082 0.037 0.6478 0.6143 1 -2 | 0.40392 1.239 0.215 1.4977 1.5735 1 -3 | 0.43414 2.082 0.037 1.5436 1.6278 ---------------------------------------------------------------- Variable: scale1_std (sd=1.3065869) Odds comparing | Alternative 1 | to Alternative 2 | b z 
P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | 0.33160 0.917 0.359 1.3932 1.5423 2 -1 | 0.68975 2.549 0.011 1.9932 2.4626 3 -2 | -0.33160 -0.917 0.359 0.7178 0.6484 3 -1 | 0.35815 1.611 0.107 1.4307 1.5967 1 -2 | -0.68975 -2.549 0.011 0.5017 0.4061 1 -3 | -0.35815 -1.611 0.107 0.6990 0.6263 ---------------------------------------------------------------- Variable: demands (sd=1.3001791) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | -0.19361 -0.540 0.589 0.8240 0.7775 2 -1 | -0.15282 -0.568 0.570 0.8583 0.8198 3 -2 | 0.19361 0.540 0.589 1.2136 1.2862 3 -1 | 0.04079 0.199 0.843 1.0416 1.0545 1 -2 | 0.15282 0.568 0.570 1.1651 1.2198 1 -3 | -0.04079 -0.199 0.843 0.9600 0.9484 ---------------------------------------------------------------- Variable: interestlvl (sd=1.3375303) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | 0.51699 1.633 0.103 1.6770 1.9967 2 -1 | 0.12130 0.540 0.589 1.1290 1.1761 3 -2 | -0.51699 -1.633 0.103 0.5963 0.5008 3 -1 | -0.39569 -1.758 0.079 0.6732 0.5890 1 -2 | -0.12130 -0.540 0.589 0.8858 0.8502 1 -3 | 0.39569 1.758 0.079 1.4854 1.6977 ---------------------------------------------------------------- Variable: jobgoal (sd=1.1480244) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | -0.01235 -0.031 0.975 0.9877 0.9859 2 -1 | 0.14562 0.522 0.602 1.1568 1.1820 3 -2 | 0.01235 0.031 0.975 1.0124 1.0143 3 -1 | 0.15797 0.571 0.568 1.1711 1.1988 1 -2 | -0.14562 -0.522 0.602 0.8645 0.8461 1 -3 | -0.15797 -0.571 0.568 0.8539 0.8341 ---------------------------------------------------------------- Variable: scale3 (sd=1.3960138) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX 
------------------+--------------------------------------------- 2 -3 | -0.36582 -0.881 0.378 0.6936 0.6001 2 -1 | 0.21804 0.632 0.528 1.2436 1.3558 3 -2 | 0.36582 0.881 0.378 1.4417 1.6664 3 -1 | 0.58386 2.572 0.010 1.7929 2.2593 1 -2 | -0.21804 -0.632 0.528 0.8041 0.7376 1 -3 | -0.58386 -2.572 0.010 0.5577 0.4426 ---------------------------------------------------------------- Variable: scale2_std (sd=1.2168745) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | -0.30445 -0.761 0.447 0.7375 0.6904 2 -1 | -0.34536 -1.232 0.218 0.7080 0.6569 3 -2 | 0.30445 0.761 0.447 1.3559 1.4484 3 -1 | -0.04091 -0.154 0.878 0.9599 0.9514 1 -2 | 0.34536 1.232 0.218 1.4125 1.5224 1 -3 | 0.04091 0.154 0.878 1.0418 1.0510 ---------------------------------------------------------------- Variable: motivation (sd=1.3483175) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | -0.05212 -0.078 0.938 0.9492 0.9321 2 -1 | 0.05695 0.179 0.858 1.0586 1.0798 3 -2 | 0.05212 0.078 0.938 1.0535 1.0728 3 -1 | 0.10907 0.189 0.850 1.1152 1.1584 1 -2 | -0.05695 -0.179 0.858 0.9446 0.9261 1 -3 | -0.10907 -0.189 0.850 0.8967 0.8632 ---------------------------------------------------------------- Variable: parented (sd=.48958102) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -3 | 0.88442 0.724 0.469 2.4216 1.5419 2 -1 | 0.46178 0.573 0.566 1.5869 1.2537 3 -2 | -0.88442 -0.724 0.469 0.4130 0.6486 3 -1 | -0.42264 -0.591 0.554 0.6553 0.8131 1 -2 | -0.46178 -0.573 0.566 0.6302 0.7977 1 -3 | 0.42264 0.591 0.554 1.5260 1.2299 ---------------------------------------------------------------- Variable: city (sd=.37650904) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX 
------------------+--------------------------------------------- 2 -3 | -0.63578 -0.585 0.558 0.5295 0.7871 2 -1 | -0.22439 -0.233 0.815 0.7990 0.9190 3 -2 | 0.63578 0.585 0.558 1.8885 1.2705 3 -1 | 0.41139 0.703 0.482 1.5089 1.1675 1 -2 | 0.22439 0.233 0.815 1.2516 1.0882 1 -3 | -0.41139 -0.703 0.482 0.6627 0.8565 ---------------------------------------------------------------- Variable: female (sd=.50129175) Odds comparing | Alternative 1 | to Alternative 2 | b z P>|z| e^b e^bStdX ------------------+--------------------------------------------- 2 -1 | 1.25085 1.758 0.079 3.4933 1.8721 1 -2 | -1.25085 -1.758 0.079 0.2863 0.5342 ---------------------------------------------------------------- . . log close log: D:\wf\work\wf3-longcommand.log log type: text closed on: 24 Oct 2008, 09:40:50 -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- . exit end of do-file . do wf3-longoutputlines.do . capture log close . log using wf3-longoutputlines, replace text (note: file D:\wf\work\wf3-longoutputlines.log not found) -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- log: D:\wf\work\wf3-longoutputlines.log log type: text opened on: 24 Oct 2008, 09:40:50 . . // program: wf3-longoutputlines.do \ for stata 9 . // task: problem with long output lines . // project: workflow - chapter 3 . // author: scott long \ 2008-10-24 . . // #1 . // load data . . use wf-occupation, clear (Workflow data on occupational attainment \ 2008-04-02) . . // #2 . // long line problem when using linesize 132 . . set linesize 132 . 
tabulate occ ed, row +----------------+ | Key | |----------------| | frequency | | row percentage | +----------------+ | Years of education Occupation | 3 6 7 8 9 10 11 12 13 | Total -----------+---------------------------------------------------------------------------------------------------+---------- Menial | 0 2 0 0 3 1 3 12 2 | 31 | 0.00 6.45 0.00 0.00 9.68 3.23 9.68 38.71 6.45 | 100.00 -----------+---------------------------------------------------------------------------------------------------+---------- BlueCol | 1 3 1 7 4 6 5 26 7 | 69 | 1.45 4.35 1.45 10.14 5.80 8.70 7.25 37.68 10.14 | 100.00 -----------+---------------------------------------------------------------------------------------------------+---------- Craft | 0 3 2 3 2 2 7 39 7 | 84 | 0.00 3.57 2.38 3.57 2.38 2.38 8.33 46.43 8.33 | 100.00 -----------+---------------------------------------------------------------------------------------------------+---------- WhiteCol | 0 0 0 1 0 1 2 19 4 | 41 | 0.00 0.00 0.00 2.44 0.00 2.44 4.88 46.34 9.76 | 100.00 -----------+---------------------------------------------------------------------------------------------------+---------- Prof | 0 0 1 1 0 0 2 13 10 | 112 | 0.00 0.00 0.89 0.89 0.00 0.00 1.79 11.61 8.93 | 100.00 -----------+---------------------------------------------------------------------------------------------------+---------- Total | 1 8 4 12 9 10 19 109 30 | 337 | 0.30 2.37 1.19 3.56 2.67 2.97 5.64 32.34 8.90 | 100.00 | Years of education Occupation | 14 15 16 17 18 19 20 | Total -----------+-----------------------------------------------------------------------------+---------- Menial | 7 1 0 0 0 0 0 | 31 | 22.58 3.23 0.00 0.00 0.00 0.00 0.00 | 100.00 -----------+-----------------------------------------------------------------------------+---------- BlueCol | 3 2 4 0 0 0 0 | 69 | 4.35 2.90 5.80 0.00 0.00 0.00 0.00 | 100.00 -----------+-----------------------------------------------------------------------------+---------- Craft | 13 3 
3 0 0 0 0 | 84 | 15.48 3.57 3.57 0.00 0.00 0.00 0.00 | 100.00 -----------+-----------------------------------------------------------------------------+---------- WhiteCol | 1 5 6 1 1 0 0 | 41 | 2.44 12.20 14.63 2.44 2.44 0.00 0.00 | 100.00 -----------+-----------------------------------------------------------------------------+---------- Prof | 14 7 32 6 8 13 5 | 112 | 12.50 6.25 28.57 5.36 7.14 11.61 4.46 | 100.00 -----------+-----------------------------------------------------------------------------+---------- Total | 38 18 45 7 9 13 5 | 337 | 11.28 5.34 13.35 2.08 2.67 3.86 1.48 | 100.00 . . // #3 . // no problem with linesize 80 . . set linesize 80 . tabulate occ ed, row +----------------+ | Key | |----------------| | frequency | | row percentage | +----------------+ | Years of education Occupation | 3 6 7 8 9 | Total -----------+-------------------------------------------------------+---------- Menial | 0 2 0 0 3 | 31 | 0.00 6.45 0.00 0.00 9.68 | 100.00 -----------+-------------------------------------------------------+---------- BlueCol | 1 3 1 7 4 | 69 | 1.45 4.35 1.45 10.14 5.80 | 100.00 -----------+-------------------------------------------------------+---------- Craft | 0 3 2 3 2 | 84 | 0.00 3.57 2.38 3.57 2.38 | 100.00 -----------+-------------------------------------------------------+---------- WhiteCol | 0 0 0 1 0 | 41 | 0.00 0.00 0.00 2.44 0.00 | 100.00 -----------+-------------------------------------------------------+---------- Prof | 0 0 1 1 0 | 112 | 0.00 0.00 0.89 0.89 0.00 | 100.00 -----------+-------------------------------------------------------+---------- Total | 1 8 4 12 9 | 337 | 0.30 2.37 1.19 3.56 2.67 | 100.00 | Years of education Occupation | 10 11 12 13 14 | Total -----------+-------------------------------------------------------+---------- Menial | 1 3 12 2 7 | 31 | 3.23 9.68 38.71 6.45 22.58 | 100.00 -----------+-------------------------------------------------------+---------- BlueCol | 6 5 26 7 3 | 69 | 8.70 7.25 37.68 
10.14 4.35 | 100.00 -----------+-------------------------------------------------------+---------- Craft | 2 7 39 7 13 | 84 | 2.38 8.33 46.43 8.33 15.48 | 100.00 -----------+-------------------------------------------------------+---------- WhiteCol | 1 2 19 4 1 | 41 | 2.44 4.88 46.34 9.76 2.44 | 100.00 -----------+-------------------------------------------------------+---------- Prof | 0 2 13 10 14 | 112 | 0.00 1.79 11.61 8.93 12.50 | 100.00 -----------+-------------------------------------------------------+---------- Total | 10 19 109 30 38 | 337 | 2.97 5.64 32.34 8.90 11.28 | 100.00 | Years of education Occupation | 15 16 17 18 19 | Total -----------+-------------------------------------------------------+---------- Menial | 1 0 0 0 0 | 31 | 3.23 0.00 0.00 0.00 0.00 | 100.00 -----------+-------------------------------------------------------+---------- BlueCol | 2 4 0 0 0 | 69 | 2.90 5.80 0.00 0.00 0.00 | 100.00 -----------+-------------------------------------------------------+---------- Craft | 3 3 0 0 0 | 84 | 3.57 3.57 0.00 0.00 0.00 | 100.00 -----------+-------------------------------------------------------+---------- WhiteCol | 5 6 1 1 0 | 41 | 12.20 14.63 2.44 2.44 0.00 | 100.00 -----------+-------------------------------------------------------+---------- Prof | 7 32 6 8 13 | 112 | 6.25 28.57 5.36 7.14 11.61 | 100.00 -----------+-------------------------------------------------------+---------- Total | 18 45 7 9 13 | 337 | 5.34 13.35 2.08 2.67 3.86 | 100.00 | Years of | education Occupation | 20 | Total -----------+-----------+---------- Menial | 0 | 31 | 0.00 | 100.00 -----------+-----------+---------- BlueCol | 0 | 69 | 0.00 | 100.00 -----------+-----------+---------- Craft | 0 | 84 | 0.00 | 100.00 -----------+-----------+---------- WhiteCol | 0 | 41 | 0.00 | 100.00 -----------+-----------+---------- Prof | 5 | 112 | 4.46 | 100.00 -----------+-----------+---------- Total | 5 | 337 | 1.48 | 100.00 . . 
log close
       log:  D:\wf\work\wf3-longoutputlines.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:51
--------------------------------------------------------------------------------
. exit
end of do-file
.
. * do-file templates
. do wf3-example.do
. capture log close
. log using wf3-example, replace text
(note: file D:\wf\work\wf3-example.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-example.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:51
.
. // wf3-example.do: compute descriptive statistics \ for stata 9
. // scott long 23Oct2008
.
. version 9.2
. clear // changed to clear all in stata 10
. macro drop _all
. set linesize 80
.
. * load the data and check descriptive statistics
. use wf-lfp, clear
(Workflow data on labor force participation \ 2008-04-02)
. summarize
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         lfp |       753    .5683931    .4956295          0          1
          k5 |       753    .2377158     .523959          0          3
        k618 |       753    1.353254    1.319874          0          8
         age |       753    42.53785    8.072574         30         60
          wc |       753    .2815405    .4500494          0          1
-------------+--------------------------------------------------------
          hc |       753    .3917663    .4884694          0          1
         lwg |       753    1.097115    .5875564  -2.054124   3.218876
         inc |       753    20.12897     11.6348  -.0290001         96
.
. log close
       log:  D:\wf\work\wf3-example.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:51
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf3-simple.do
. capture log close
. log using _name_, replace text
(note: file D:\wf\work\_name_.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\_name_.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:51
.
. // _name_.do \ for stata 9:
. // scott long _date_
.
. version 9.2
. clear // changed to clear all in stata 10
. macro drop _all
. set linesize 80
.
. *
.
.
log close
       log:  D:\wf\work\_name_.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:51
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf3-complex.do
. capture log close
. log using _name_, replace text
--------------------------------------------------------------------------------
       log:  D:\wf\work\_name_.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:51
.
. // program: _name_.do \ for stata 9
. // task:
. // project:
. // author: _who_ \ _date_
.
. // #0
. // program setup
.
. version 9.2
. clear // changed to clear all in stata 10
. set linesize 80
. macro drop _all
.
. // #1
. // describe task 1
.
. // #2
. // describe task 2
.
. log close
       log:  D:\wf\work\_name_.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:51
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf3-subsample.do
. capture log close
. log using wf3-subsample, replace text
(note: file D:\wf\work\wf3-subsample.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-subsample.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:51
.
. // program: wf3-subsample.do \ for stata 9
. // task: debugging with a small sample
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data
.
. use wf-lfp, clear
(Workflow data on labor force participation \ 2008-04-02)
.
. // #2
. // create a small random sample
.
. set seed 11020
. generate isin = (uniform()>.8) // changed to runiform in stata 10
. label var isin "1 if in random sample (seed 11020)"
. label def isin 0 0_NoIn 1 1_InSample
. label val isin isin
. keep if isin
(601 observations deleted)
. tabulate isin, missing
    1 if in |
     random |
     sample |
      (seed |
     11020) |      Freq.     Percent        Cum.
------------+-----------------------------------
 1_InSample |        152      100.00      100.00
------------+-----------------------------------
      Total |        152      100.00
.
. // #3
. // save the subsample
.
. label data "20% subsample of wf-lfp."
. note: wf3-subsample.do jsl 2008-10-24.
. save x-wf3-subsample, replace
(note: file x-wf3-subsample.dta not found)
file x-wf3-subsample.dta saved
.
. // #4
. // try command using subsample
.
. logit lfp k5 k618 age wc hc lwg inc
Iteration 0:   log likelihood = -105.23992
Iteration 1:   log likelihood = -88.732344
Iteration 2:   log likelihood = -87.429901
Iteration 3:   log likelihood = -87.374666
Iteration 4:   log likelihood = -87.374542
Logistic regression                               Number of obs   =        152
                                                  LR chi2(7)      =      35.73
                                                  Prob > chi2     =     0.0000
Log likelihood = -87.374542                       Pseudo R2       =     0.1698
------------------------------------------------------------------------------
         lfp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          k5 |  -1.423554    .467847    -3.04   0.002    -2.340517   -.5065907
        k618 |   .0208013   .1398516     0.15   0.882    -.2533028    .2949053
         age |  -.0239279   .0284528    -0.84   0.400    -.0796945    .0318386
          wc |   1.583129   .5709745     2.77   0.006     .4640392    2.702218
          hc |  -.1874739   .4668219    -0.40   0.688    -1.102428    .7274802
         lwg |   .6308349   .3473962     1.82   0.069    -.0500491    1.311719
         inc |  -.0595849   .0229887    -2.59   0.010    -.1046419    -.014528
       _cons |    1.54916   1.439105     1.08   0.282    -1.271434    4.369755
------------------------------------------------------------------------------
.
. log close
       log:  D:\wf\work\wf3-subsample.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:51
--------------------------------------------------------------------------------
. exit
end of do-file
.
. * debugging graphs
. do wf3-debug-graph1.do, nostop
. capture log close
.
log using wf3-debug-graph1, replace text
(note: file D:\wf\work\wf3-debug-graph1.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-debug-graph1.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:51
.
. // program: wf3-debug-graph1.do \ for stata 9
. // task: debugging example of graph command
. // step 1 - full command generating 198 error
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data
.
. use wf-acjob, clear
(Workflow data on academic biochemists \ 2008-04-02)
.
. // #2
. // create graph
.
. twoway (scatter job phd, msymbol(smcircle_hollow) msize(small)), ///
> ytitle(Where do you work?) yscale(range(1 5.)) ylabel(1(1)5, angle(ninety)
> ) xtitle(Where did you graduate?) ///
> xscale(range(1 5)) xlabel(1,5) caption(wf3-debug-graph1.do 2008-10-24, siz
> e(small)) scheme(s2manual) aspectratio(1) by(fem)
option 5 not allowed
r(198);
.
. log close
       log:  D:\wf\work\wf3-debug-graph1.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:52
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf3-debug-graph2.do, nostop
. capture log close
. log using wf3-debug-graph2, replace text
(note: file D:\wf\work\wf3-debug-graph2.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-debug-graph2.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:52
.
. // program: wf3-debug-graph2.do \ for stata 9
. // task: debugging example of graph command
. // step 2 - reformat command to make it easier to read
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data
.
.
use wf-acjob, clear
(Workflow data on academic biochemists \ 2008-04-02)
.
. // #2
. // create graph
.
. twoway (scatter job phd, msymbol(smcircle_hollow) msize(small)), ///
> ytitle(Where do you work?) yscale(range(1 5.)) ///
> ylabel(1(1)5, angle(ninety)) ///
> xtitle(Where did you graduate?) xscale(range(1 5)) xlabel(1,5) ///
> caption(wf3-debug-graph2.do 2008-10-24, size(small)) ///
> scheme(s2manual) aspectratio(1) by(fem)
option 5 not allowed
r(198);
.
. log close
       log:  D:\wf\work\wf3-debug-graph2.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:53
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf3-debug-graph3.do, nostop
. capture log close
. log using wf3-debug-graph3, replace text
(note: file D:\wf\work\wf3-debug-graph3.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-debug-graph3.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:53
.
. // program: wf3-debug-graph3.do \ for stata 9
. // task: debugging example of graph command
. // step 3 - test the variables being plotted
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data
.
. use wf-acjob, clear
(Workflow data on academic biochemists \ 2008-04-02)
.
. // #2
. // create scatterplot
.
. scatter job phd
.
. log close
       log:  D:\wf\work\wf3-debug-graph3.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:54
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf3-debug-graph4.do, nostop
. capture log close
. log using wf3-debug-graph4, replace text
(note: file D:\wf\work\wf3-debug-graph4.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-debug-graph4.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:54
.
.
// program: wf3-debug-graph4.do \ for stata 9
. // task: debugging example of graph command
. // step 4 - only run first line of command
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data
.
. use wf-acjob, clear
(Workflow data on academic biochemists \ 2008-04-02)
.
. // #2
. // create graph
.
. twoway (scatter job phd, msymbol(smcircle_hollow) msize(small)), /* ///
> ytitle(Where do you work?) yscale(range(1 5.)) ///
> ylabel(1(1)5, angle(ninety)) ///
> xtitle(Where did you graduate?) xscale(range(1 5)) xlabel(1,5) ///
> caption(wf3-debug-graph4.do 2008-10-24, size(small)) ///
> scheme(s2manual) aspectratio(1) by(fem) */
.
. log close
       log:  D:\wf\work\wf3-debug-graph4.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:56
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf3-debug-graph5.do, nostop
. capture log close
. log using wf3-debug-graph5, replace text
(note: file D:\wf\work\wf3-debug-graph5.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-debug-graph5.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:56
.
. // program: wf3-debug-graph5.do \ for stata 9
. // task: debugging example of graph command
. // step 5 - add commands for y-axis
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data
.
. use wf-acjob, clear
(Workflow data on academic biochemists \ 2008-04-02)
.
. // #2
. // create graph
.
. twoway (scatter job phd, msymbol(smcircle_hollow) msize(small)), ///
> ytitle(Where do you work?) yscale(range(1 5.)) ///
> ylabel(1(1)5, angle(ninety)) /* ///
> xtitle(Where did you graduate?)
xscale(range(1 5)) xlabel(1,5) ///
> caption(wf3-debug-graph5.do 2008-10-24, size(small)) ///
> scheme(s2manual) aspectratio(1) by(fem) */
.
. log close
       log:  D:\wf\work\wf3-debug-graph5.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:57
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf3-debug-graph6.do, nostop
. capture log close
. log using wf3-debug-graph6, replace text
(note: file D:\wf\work\wf3-debug-graph6.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-debug-graph6.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:57
.
. // program: wf3-debug-graph6.do \ for stata 9
. // task: debugging example of graph command
. // step 6 - add commands for x-axis
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data
.
. use wf-acjob, clear
(Workflow data on academic biochemists \ 2008-04-02)
.
. // #2
. // create graph
.
. twoway (scatter job phd, msymbol(smcircle_hollow) msize(small)), ///
> ytitle(Where do you work?) yscale(range(1 5.)) ///
> ylabel(1(1)5, angle(ninety)) ///
> xtitle(Where did you graduate?) xscale(range(1 5)) xlabel(1,5) /* ///
> caption(wf3-debug-graph6.do 2008-10-24, size(small)) ///
> scheme(s2manual) aspectratio(1) by(fem) */
option 5 not allowed
r(198);
.
. log close
       log:  D:\wf\work\wf3-debug-graph6.log
  log type:  text
 closed on:  24 Oct 2008, 09:40:58
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf3-debug-graph7.do, nostop
. capture log close
. log using wf3-debug-graph7, replace text
(note: file D:\wf\work\wf3-debug-graph7.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-debug-graph7.log
  log type:  text
 opened on:  24 Oct 2008, 09:40:58
.
.
// program: wf3-debug-graph7.do \ for stata 9
. // task: debugging example of graph command
. // step 7 - change xlabel(1,5) to xlabel(1(1)5)
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data
.
. use wf-acjob, clear
(Workflow data on academic biochemists \ 2008-04-02)
.
. // #2
. // create graph
.
. twoway (scatter job phd, msymbol(smcircle_hollow) msize(small)), ///
> ytitle(Where do you work?) yscale(range(1 5.)) ///
> ylabel(1(1)5, angle(ninety)) ///
> xtitle(Where did you graduate?) xscale(range(1 5)) xlabel(1(1)5) ///
> caption(wf3-debug-graph7.do 2008-10-24, size(small)) ///
> scheme(s2manual) aspectratio(1) by(fem)
.
. log close
       log:  D:\wf\work\wf3-debug-graph7.log
  log type:  text
 closed on:  24 Oct 2008, 09:41:00
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf3-debug-graph8.do, nostop
. capture log close
. log using wf3-debug-graph8, replace text
(note: file D:\wf\work\wf3-debug-graph8.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-debug-graph8.log
  log type:  text
 opened on:  24 Oct 2008, 09:41:00
.
. // program: wf3-debug-graph8.do \ for stata 9
. // task: debugging example of graph command
. // step 8 - add xtitle and xscale
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data
.
. use wf-acjob, clear
(Workflow data on academic biochemists \ 2008-04-02)
.
. // #2
. // create graph
.
. twoway (scatter job phd, msymbol(smcircle_hollow) msize(small)), ///
> ytitle(Where do you work?) yscale(range(1 5.)) ///
> ylabel(1(1)5, angle(ninety)) ///
> xtitle(Where did you graduate?)
xscale(range(1 5)) /* xlabel(1,5) ///
> caption(wf3-debug-graph8.do 2008-10-24, size(small)) ///
> scheme(s2manual) aspectratio(1) by(fem) */
.
. log close
       log:  D:\wf\work\wf3-debug-graph8.log
  log type:  text
 closed on:  24 Oct 2008, 09:41:01
--------------------------------------------------------------------------------
. exit
end of do-file
.
. * debugging unanticipated problems
. do wf3-debug-precision.do
. capture log close
. log using wf3-debug-precision, replace text
(note: file D:\wf\work\wf3-debug-precision.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf3-debug-precision.log
  log type:  text
 opened on:  24 Oct 2008, 09:41:01
.
. // program: wf3-debug-precision.do \ for stata 9
. // task: error caused by insufficient precision
. // project: workflow - chapter 3
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data and look at basic information
.
. use wf-flims, clear
(Workflow data on functional limitations \ 2008-04-02)
. summarize hnd hvy lft rch sit std stp str wlk
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         hnd |      1644     .169708    .3754903          0          1
         hvy |      1644    .4288321    .4950598          0          1
         lft |      1644    .2475669    .4317301          0          1
         rch |      1644    .1703163    .3760248          0          1
         sit |      1644    .2104623     .407761          0          1
-------------+--------------------------------------------------------
         std |      1644    .3607056    .4803514          0          1
         stp |      1644    .3643552    .4813953          0          1
         str |      1644    .2974453    .4572732          0          1
         wlk |      1644    .2706813    .4444469          0          1
.
. // #2
. // combining two variables
.
. generate strwlk = 10*str + wlk
. tabulate strwlk, missing
     strwlk |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      1,091       66.36       66.36
          1 |         64        3.89       70.26
         10 |        108        6.57       76.82
         11 |        381       23.18      100.00
------------+-----------------------------------
      Total |      1,644      100.00
.
. // #2
.
// create variable containing all limitations . . generate flimall = hnd*100000000 + hvy*10000000 + lft*1000000 /// > + rch*100000 + sit*10000 + std*1000 + stp*100 + str*10 + wlk . label var flimall "hnd-hvy-lft-rch-sit-std-stp-str-wlk" . tabulate flimall, missing hnd-hvy-lft | -rch-sit-st | d-stp-str-w | lk | Freq. Percent Cum. ------------+----------------------------------- 0 | 715 43.49 43.49 1 | 5 0.30 43.80 10 | 8 0.49 44.28 11 | 2 0.12 44.40 100 | 28 1.70 46.11 101 | 1 0.06 46.17 110 | 7 0.43 46.59 111 | 1 0.06 46.65 1000 | 26 1.58 48.24 1001 | 5 0.30 48.54 1010 | 3 0.18 48.72 1011 | 1 0.06 48.78 1100 | 11 0.67 49.45 1101 | 1 0.06 49.51 1110 | 6 0.36 49.88 1111 | 5 0.30 50.18 10000 | 14 0.85 51.03 10010 | 1 0.06 51.09 10100 | 2 0.12 51.22 10101 | 1 0.06 51.28 10110 | 1 0.06 51.34 11000 | 4 0.24 51.58 11001 | 1 0.06 51.64 11011 | 1 0.06 51.70 11100 | 4 0.24 51.95 11101 | 1 0.06 52.01 11110 | 1 0.06 52.07 11111 | 6 0.36 52.43 100000 | 2 0.12 52.55 100010 | 1 0.06 52.62 100101 | 1 0.06 52.68 101001 | 1 0.06 52.74 111000 | 1 0.06 52.80 111111 | 1 0.06 52.86 1000000 | 1 0.06 52.92 1000001 | 1 0.06 52.98 1000010 | 3 0.18 53.16 1000011 | 1 0.06 53.22 1000100 | 1 0.06 53.28 1000110 | 1 0.06 53.35 1000111 | 1 0.06 53.41 1001000 | 1 0.06 53.47 1001001 | 1 0.06 53.53 1001100 | 2 0.12 53.65 1001101 | 1 0.06 53.71 1001110 | 1 0.06 53.77 1001111 | 2 0.12 53.89 1010111 | 1 0.06 53.95 1011000 | 1 0.06 54.01 1100111 | 1 0.06 54.08 1101100 | 1 0.06 54.14 1.00e+07 | 86 5.23 59.37 1.00e+07 | 4 0.24 59.61 1.00e+07 | 5 0.30 59.91 1.00e+07 | 5 0.30 60.22 1.00e+07 | 26 1.58 61.80 1.00e+07 | 1 0.06 61.86 1.00e+07 | 5 0.30 62.17 1.00e+07 | 4 0.24 62.41 1.00e+07 | 26 1.58 63.99 1.00e+07 | 2 0.12 64.11 1.00e+07 | 3 0.18 64.29 1.00e+07 | 6 0.36 64.66 1.00e+07 | 16 0.97 65.63 1.00e+07 | 7 0.43 66.06 1.00e+07 | 9 0.55 66.61 1.00e+07 | 16 0.97 67.58 1.00e+07 | 5 0.30 67.88 1.00e+07 | 2 0.12 68.00 1.00e+07 | 4 0.24 68.25 1.00e+07 | 2 0.12 68.37 1.00e+07 | 1 0.06 68.43 1.00e+07 | 6 0.36 
68.80 1.00e+07 | 3 0.18 68.98 1.00e+07 | 8 0.49 69.46 1.00e+07 | 4 0.24 69.71 1.00e+07 | 6 0.36 70.07 1.01e+07 | 2 0.12 70.19 1.01e+07 | 1 0.06 70.26 1.01e+07 | 1 0.06 70.32 1.01e+07 | 2 0.12 70.44 1.01e+07 | 2 0.12 70.56 1.01e+07 | 1 0.06 70.62 1.01e+07 | 3 0.18 70.80 1.01e+07 | 3 0.18 70.99 1.01e+07 | 1 0.06 71.05 1.01e+07 | 1 0.06 71.11 1.01e+07 | 1 0.06 71.17 1.01e+07 | 1 0.06 71.23 1.01e+07 | 1 0.06 71.29 1.01e+07 | 2 0.12 71.41 1.10e+07 | 15 0.91 72.32 1.10e+07 | 1 0.06 72.38 1.10e+07 | 5 0.30 72.69 1.10e+07 | 2 0.12 72.81 1.10e+07 | 6 0.36 73.18 1.10e+07 | 1 0.06 73.24 1.10e+07 | 3 0.18 73.42 1.10e+07 | 4 0.24 73.66 1.10e+07 | 3 0.18 73.84 1.10e+07 | 3 0.18 74.03 1.10e+07 | 5 0.30 74.33 1.10e+07 | 4 0.24 74.57 1.10e+07 | 6 0.36 74.94 1.10e+07 | 6 0.36 75.30 1.10e+07 | 21 1.28 76.58 1.10e+07 | 1 0.06 76.64 1.10e+07 | 2 0.12 76.76 1.10e+07 | 1 0.06 76.82 1.10e+07 | 1 0.06 76.89 1.10e+07 | 6 0.36 77.25 1.10e+07 | 1 0.06 77.31 1.10e+07 | 29 1.76 79.08 1.11e+07 | 1 0.06 79.14 1.11e+07 | 1 0.06 79.20 1.11e+07 | 1 0.06 79.26 1.11e+07 | 4 0.24 79.50 1.11e+07 | 2 0.12 79.62 1.11e+07 | 2 0.12 79.74 1.11e+07 | 1 0.06 79.81 1.11e+07 | 2 0.12 79.93 1.11e+07 | 3 0.18 80.11 1.11e+07 | 13 0.79 80.90 1.11e+07 | 1 0.06 80.96 1.11e+07 | 1 0.06 81.02 1.11e+07 | 1 0.06 81.08 1.11e+07 | 1 0.06 81.14 1.11e+07 | 1 0.06 81.20 1.11e+07 | 1 0.06 81.27 1.11e+07 | 2 0.12 81.39 1.11e+07 | 3 0.18 81.57 1.11e+07 | 24 1.46 83.03 1.00e+08 | 15 0.91 83.94 1.00e+08 | 4 0.24 84.18 1.00e+08 | 1 0.06 84.25 1.00e+08 | 1 0.06 84.31 1.00e+08 | 1 0.06 84.37 1.00e+08 | 3 0.18 84.55 1.00e+08 | 3 0.18 84.73 1.00e+08 | 3 0.18 84.91 1.00e+08 | 1 0.06 84.98 1.00e+08 | 1 0.06 85.04 1.00e+08 | 2 0.12 85.16 1.00e+08 | 2 0.12 85.28 1.00e+08 | 1 0.06 85.34 1.00e+08 | 2 0.12 85.46 1.00e+08 | 2 0.12 85.58 1.01e+08 | 1 0.06 85.64 1.01e+08 | 1 0.06 85.71 1.01e+08 | 2 0.12 85.83 1.01e+08 | 1 0.06 85.89 1.01e+08 | 1 0.06 85.95 1.01e+08 | 1 0.06 86.01 1.10e+08 | 3 0.18 86.19 1.10e+08 | 3 0.18 86.37 1.10e+08 | 1 0.06 
86.44 1.10e+08 | 1 0.06 86.50 1.10e+08 | 3 0.18 86.68 1.10e+08 | 9 0.55 87.23 1.10e+08 | 1 0.06 87.29 1.10e+08 | 1 0.06 87.35 1.10e+08 | 1 0.06 87.41 1.10e+08 | 1 0.06 87.47 1.10e+08 | 4 0.24 87.71 1.10e+08 | 1 0.06 87.77 1.10e+08 | 1 0.06 87.83 1.10e+08 | 1 0.06 87.90 1.10e+08 | 1 0.06 87.96 1.10e+08 | 2 0.12 88.08 1.10e+08 | 1 0.06 88.14 1.10e+08 | 7 0.43 88.56 1.11e+08 | 5 0.30 88.87 1.11e+08 | 2 0.12 88.99 1.11e+08 | 2 0.12 89.11 1.11e+08 | 1 0.06 89.17 1.11e+08 | 1 0.06 89.23 1.11e+08 | 10 0.61 89.84 1.11e+08 | 10 0.61 90.45 1.11e+08 | 1 0.06 90.51 1.11e+08 | 15 0.91 91.42 1.11e+08 | 4 0.24 91.67 1.11e+08 | 137 8.33 100.00 ------------+----------------------------------- Total | 1,644 100.00 . . // #3 . // create a string version of all limitations . . generate sflimall=string(flimall, "%16.0f") . label var sflimall "hnd-hvy-lft-rch-sit-std-stp-str-wlk" . tabulate sflimall, missing hnd-hvy-lft | -rch-sit-st | d-stp-str-w | lk | Freq. Percent Cum. ------------+----------------------------------- 0 | 715 43.49 43.49 1 | 5 0.30 43.80 10 | 8 0.49 44.28 100 | 28 1.70 45.99 1000 | 26 1.58 47.57 10000 | 14 0.85 48.42 100000 | 2 0.12 48.54 1000000 | 1 0.06 48.60 10000000 | 86 5.23 53.83 100000000 | 15 0.91 54.74 10000001 | 4 0.24 54.99 100000096 | 4 0.24 55.23 1000001 | 1 0.06 55.29 10000010 | 5 0.30 55.60 10000011 | 5 0.30 55.90 100000112 | 1 0.06 55.96 1000010 | 3 0.18 56.14 10000100 | 26 1.58 57.73 100001000 | 1 0.06 57.79 100001008 | 1 0.06 57.85 10000101 | 1 0.06 57.91 1000011 | 1 0.06 57.97 10000110 | 5 0.30 58.27 100001104 | 3 0.18 58.45 10000111 | 4 0.24 58.70 100001112 | 3 0.18 58.88 100010 | 1 0.06 58.94 1000100 | 1 0.06 59.00 10001000 | 26 1.58 60.58 100010000 | 3 0.18 60.77 10001001 | 2 0.12 60.89 10001010 | 3 0.18 61.07 10001011 | 6 0.36 61.44 1000110 | 1 0.06 61.50 10001100 | 16 0.97 62.47 10001101 | 7 0.43 62.90 1000111 | 1 0.06 62.96 10001110 | 9 0.55 63.50 10001111 | 16 0.97 64.48 100011112 | 1 0.06 64.54 1001 | 5 0.30 64.84 10010 | 1 0.06 64.90 
1001000 | 1 0.06 64.96 10010000 | 5 0.30 65.27 10010001 | 2 0.12 65.39 1001001 | 1 0.06 65.45 100101 | 1 0.06 65.51 10010100 | 4 0.24 65.75 100101000 | 1 0.06 65.82 10010110 | 2 0.12 65.94 100101104 | 2 0.12 66.06 10010111 | 1 0.06 66.12 100101112 | 2 0.12 66.24 1001100 | 2 0.12 66.36 10011000 | 6 0.36 66.73 1001101 | 1 0.06 66.79 10011011 | 3 0.18 66.97 1001110 | 1 0.06 67.03 10011100 | 8 0.49 67.52 100111008 | 1 0.06 67.58 1001111 | 2 0.12 67.70 10011110 | 4 0.24 67.94 100111104 | 2 0.12 68.07 10011111 | 6 0.36 68.43 100111112 | 2 0.12 68.55 101 | 1 0.06 68.61 1010 | 3 0.18 68.80 10100 | 2 0.12 68.92 10100000 | 2 0.12 69.04 101001 | 1 0.06 69.10 10100100 | 1 0.06 69.16 101001000 | 1 0.06 69.22 101001104 | 1 0.06 69.28 10100111 | 1 0.06 69.34 101001112 | 2 0.12 69.46 10101 | 1 0.06 69.53 10101011 | 2 0.12 69.65 10101100 | 2 0.12 69.77 10101101 | 1 0.06 69.83 1010111 | 1 0.06 69.89 10101110 | 3 0.18 70.07 10101111 | 3 0.18 70.26 1011 | 1 0.06 70.32 10110 | 1 0.06 70.38 1011000 | 1 0.06 70.44 101100008 | 1 0.06 70.50 10110101 | 1 0.06 70.56 10110110 | 1 0.06 70.62 10111010 | 1 0.06 70.68 101110112 | 1 0.06 70.74 10111100 | 1 0.06 70.80 10111101 | 1 0.06 70.86 10111111 | 2 0.12 70.99 101111112 | 1 0.06 71.05 11 | 2 0.12 71.17 110 | 7 0.43 71.59 1100 | 11 0.67 72.26 11000 | 4 0.24 72.51 11000000 | 15 0.91 73.42 110000000 | 3 0.18 73.60 11000001 | 1 0.06 73.66 110000096 | 3 0.18 73.84 11000010 | 5 0.30 74.15 110000104 | 1 0.06 74.21 11000011 | 2 0.12 74.33 11000100 | 6 0.36 74.70 110001008 | 1 0.06 74.76 11000101 | 1 0.06 74.82 11000110 | 3 0.18 75.00 110001104 | 3 0.18 75.18 11000111 | 4 0.24 75.43 110001112 | 9 0.55 75.97 11001 | 1 0.06 76.03 11001000 | 3 0.18 76.22 110010008 | 1 0.06 76.28 110010096 | 1 0.06 76.34 11001010 | 3 0.18 76.52 11001011 | 5 0.30 76.82 110010112 | 1 0.06 76.89 11001100 | 4 0.24 77.13 11001101 | 6 0.36 77.49 1100111 | 1 0.06 77.55 11001110 | 6 0.36 77.92 110011104 | 1 0.06 77.98 11001111 | 21 1.28 79.26 110011112 | 4 0.24 79.50 1101 | 1 0.06 
79.56 110100112 | 1 0.06 79.62 11010100 | 1 0.06 79.68 110101104 | 1 0.06 79.74 110101112 | 1 0.06 79.81 11011 | 1 0.06 79.87 1101100 | 1 0.06 79.93 11011000 | 2 0.12 80.05 110110000 | 1 0.06 80.11 11011010 | 1 0.06 80.17 11011011 | 1 0.06 80.23 11011100 | 6 0.36 80.60 110111008 | 2 0.12 80.72 11011101 | 1 0.06 80.78 110111104 | 1 0.06 80.84 11011111 | 29 1.76 82.60 110111112 | 7 0.43 83.03 111 | 1 0.06 83.09 1110 | 6 0.36 83.45 11100 | 4 0.24 83.70 111000 | 1 0.06 83.76 11100000 | 1 0.06 83.82 111000000 | 5 0.30 84.12 111000096 | 2 0.12 84.25 11100010 | 1 0.06 84.31 11100011 | 1 0.06 84.37 111000112 | 2 0.12 84.49 111001008 | 1 0.06 84.55 111001104 | 1 0.06 84.61 11100111 | 4 0.24 84.85 111001112 | 10 0.61 85.46 11101 | 1 0.06 85.52 11101001 | 2 0.12 85.64 11101011 | 2 0.12 85.77 11101100 | 1 0.06 85.83 11101101 | 2 0.12 85.95 11101110 | 3 0.18 86.13 11101111 | 13 0.79 86.92 111011112 | 10 0.61 87.53 1111 | 5 0.30 87.83 11110 | 1 0.06 87.90 11110000 | 1 0.06 87.96 11110100 | 1 0.06 88.02 11110101 | 1 0.06 88.08 111101104 | 1 0.06 88.14 111101112 | 15 0.91 89.05 11111 | 6 0.36 89.42 11111001 | 1 0.06 89.48 11111010 | 1 0.06 89.54 11111011 | 1 0.06 89.60 111111 | 1 0.06 89.66 11111100 | 2 0.12 89.78 11111110 | 3 0.18 89.96 111111104 | 4 0.24 90.21 11111111 | 24 1.46 91.67 111111112 | 137 8.33 100.00 ------------+----------------------------------- Total | 1,644 100.00 . . // #4 . // make the code easier to debug . . use wf-flims, clear (Workflow data on functional limitations \ 2008-04-02) . generate flimall = hnd*100000000 /// > + hvy*10000000 /// > + lft*1000000 /// > + rch*100000 /// > + sit*10000 /// > + std*1000 /// > + stp*100 /// > + str*10 /// > + wlk . label var flimall "hnd-hvy-lft-rch-sit-std-stp-str-wlk" . . // #5 . // try a simpler example and make it easier to read . . use wf-flims, clear (Workflow data on functional limitations \ 2008-04-02) . generate flimall = std*1000 /// > + stp*100 /// > + str*10 /// > + wlk . 
generate sflimall=string(flimall,"%9.0f") . label var sflimall "std-stp-str-wlk" . tabulate sflimall, missing std-stp-str | -wlk | Freq. Percent Cum. ------------+----------------------------------- 0 | 866 52.68 52.68 1 | 16 0.97 53.65 10 | 24 1.46 55.11 100 | 80 4.87 59.98 1000 | 73 4.44 64.42 1001 | 13 0.79 65.21 101 | 8 0.49 65.69 1010 | 15 0.91 66.61 1011 | 25 1.52 68.13 11 | 13 0.79 68.92 110 | 24 1.46 70.38 1100 | 72 4.38 74.76 1101 | 27 1.64 76.40 111 | 20 1.22 77.62 1110 | 45 2.74 80.35 1111 | 323 19.65 100.00 ------------+----------------------------------- Total | 1,644 100.00 . . // #6 . // keep adding one at a time till it breaks . . use wf-flims, clear (Workflow data on functional limitations \ 2008-04-02) . generate flimall = hvy*10000000 /// > + lft*1000000 /// > + rch*100000 /// > + sit*10000 /// > + std*1000 /// > + stp*100 /// > + str*10 /// > + wlk . generate sflimall = string(flimall,"%09.0f") . label var sflimall "hvy-lft-rch-sit-std-stp-str-wlk" . tabulate sflimall, missing hvy-lft-rch | -sit-std-st | p-str-wlk | Freq. Percent Cum. 
------------+----------------------------------- 000000000 | 727 44.22 44.22 000000001 | 8 0.49 44.71 000000010 | 8 0.49 45.19 000000011 | 2 0.12 45.32 000000100 | 32 1.95 47.26 000000101 | 1 0.06 47.32 000000110 | 8 0.49 47.81 000000111 | 1 0.06 47.87 000001000 | 27 1.64 49.51 000001001 | 5 0.30 49.82 000001010 | 4 0.24 50.06 000001011 | 1 0.06 50.12 000001100 | 14 0.85 50.97 000001101 | 1 0.06 51.03 000001110 | 6 0.36 51.40 000001111 | 8 0.49 51.89 000010000 | 17 1.03 52.92 000010010 | 1 0.06 52.98 000010100 | 2 0.12 53.10 000010101 | 1 0.06 53.16 000010110 | 1 0.06 53.22 000011000 | 4 0.24 53.47 000011001 | 1 0.06 53.53 000011011 | 1 0.06 53.59 000011100 | 4 0.24 53.83 000011101 | 1 0.06 53.89 000011110 | 1 0.06 53.95 000011111 | 7 0.43 54.38 000100000 | 2 0.12 54.50 000100010 | 1 0.06 54.56 000100101 | 1 0.06 54.62 000101000 | 1 0.06 54.68 000101001 | 1 0.06 54.74 000101101 | 2 0.12 54.87 000101111 | 2 0.12 54.99 000111000 | 1 0.06 55.05 000111010 | 1 0.06 55.11 000111100 | 2 0.12 55.23 000111110 | 1 0.06 55.29 000111111 | 2 0.12 55.41 001000000 | 1 0.06 55.47 001000001 | 1 0.06 55.54 001000010 | 3 0.18 55.72 001000011 | 1 0.06 55.78 001000100 | 1 0.06 55.84 001000110 | 1 0.06 55.90 001000111 | 1 0.06 55.96 001001000 | 2 0.12 56.08 001001001 | 1 0.06 56.14 001001100 | 3 0.18 56.33 001001101 | 1 0.06 56.39 001001110 | 2 0.12 56.51 001001111 | 3 0.18 56.69 001010111 | 1 0.06 56.75 001011000 | 1 0.06 56.81 001100011 | 1 0.06 56.87 001100111 | 1 0.06 56.93 001101100 | 1 0.06 57.00 001110110 | 1 0.06 57.06 001111111 | 1 0.06 57.12 010000000 | 89 5.41 62.53 010000001 | 4 0.24 62.77 010000010 | 5 0.30 63.08 010000011 | 5 0.30 63.38 010000100 | 29 1.76 65.15 010000101 | 2 0.12 65.27 010000110 | 5 0.30 65.57 010000111 | 4 0.24 65.82 010001000 | 26 1.58 67.40 010001001 | 2 0.12 67.52 010001010 | 3 0.18 67.70 010001011 | 7 0.43 68.13 010001100 | 18 1.09 69.22 010001101 | 8 0.49 69.71 010001110 | 10 0.61 70.32 010001111 | 24 1.46 71.78 010010000 | 5 0.30 72.08 010010001 | 
2 0.12 72.20 010010011 | 1 0.06 72.26 010010100 | 5 0.30 72.57 010010110 | 2 0.12 72.69 010010111 | 2 0.12 72.81 010011000 | 6 0.36 73.18 010011011 | 3 0.18 73.36 010011100 | 8 0.49 73.84 010011101 | 1 0.06 73.91 010011110 | 4 0.24 74.15 010011111 | 10 0.61 74.76 010100000 | 2 0.12 74.88 010100100 | 1 0.06 74.94 010100111 | 2 0.12 75.06 010101011 | 2 0.12 75.18 010101100 | 3 0.18 75.36 010101101 | 1 0.06 75.43 010101110 | 3 0.18 75.61 010101111 | 4 0.24 75.85 010110000 | 1 0.06 75.91 010110101 | 1 0.06 75.97 010110110 | 1 0.06 76.03 010111010 | 1 0.06 76.09 010111011 | 2 0.12 76.22 010111100 | 2 0.12 76.34 010111101 | 1 0.06 76.40 010111110 | 2 0.12 76.52 010111111 | 7 0.43 76.95 011000000 | 20 1.22 78.16 011000001 | 1 0.06 78.22 011000010 | 5 0.30 78.53 011000011 | 2 0.12 78.65 011000100 | 8 0.49 79.14 011000101 | 1 0.06 79.20 011000110 | 5 0.30 79.50 011000111 | 4 0.24 79.74 011001000 | 3 0.18 79.93 011001010 | 4 0.24 80.17 011001011 | 5 0.30 80.47 011001100 | 5 0.30 80.78 011001101 | 6 0.36 81.14 011001110 | 6 0.36 81.51 011001111 | 31 1.89 83.39 011010100 | 1 0.06 83.45 011011000 | 2 0.12 83.58 011011010 | 1 0.06 83.64 011011011 | 1 0.06 83.70 011011100 | 6 0.36 84.06 011011101 | 1 0.06 84.12 011011111 | 39 2.37 86.50 011100000 | 1 0.06 86.56 011100010 | 1 0.06 86.62 011100011 | 1 0.06 86.68 011100111 | 4 0.24 86.92 011101001 | 2 0.12 87.04 011101011 | 2 0.12 87.17 011101100 | 1 0.06 87.23 011101101 | 3 0.18 87.41 011101110 | 5 0.30 87.71 011101111 | 26 1.58 89.29 011110000 | 1 0.06 89.36 011110100 | 1 0.06 89.42 011110101 | 1 0.06 89.48 011111001 | 1 0.06 89.54 011111010 | 1 0.06 89.60 011111011 | 1 0.06 89.66 011111100 | 5 0.30 89.96 011111101 | 1 0.06 90.02 011111110 | 5 0.30 90.33 011111111 | 159 9.67 100.00 ------------+----------------------------------- Total | 1,644 100.00 . . // #7 . // look at a value that is a problem . . list if sflimall == "111111112", clean . . // #8 . // fix the problem by using higher precisions . . 
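Note on step #8: a minimal sketch, not part of the original do-file, of why the storage type matters here. A Stata float holds integers exactly only up to 16,777,216 (2^24), so a nine-digit code such as 111111111 is rounded to the nearest representable value, which is consistent with the 111111112 entries tabulated above. The variable names x_float and x_double are hypothetical and do not exist in wf-flims.

```stata
// Illustration only: compare float and double storage of a nine-digit code.
// x_float and x_double are made-up names, not variables from wf-flims.
clear
set obs 1
generate float  x_float  = 111111111   // float: about 7 digits of precision
generate double x_double = 111111111   // double: exact for integers up to 2^53
display %12.0f x_float[1]              // rounded to a nearby representable value
display %12.0f x_double[1]             // stored exactly
```

Declaring the variable as double, as step #8 does, is one way to avoid the rounding; building the code from strings, as in step #9, is another.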
use wf-flims, clear (Workflow data on functional limitations \ 2008-04-02) . generate double flimall = hnd*100000000 /// > + hvy*10000000 /// > + lft*1000000 /// > + rch*100000 /// > + sit*10000 /// > + std*1000 /// > + stp*100 /// > + str*10 /// > + wlk . generate sflimall = string(flimall,"%09.0f") . tabulate sflimall, missing sflimall | Freq. Percent Cum. ------------+----------------------------------- 000000000 | 715 43.49 43.49 000000001 | 5 0.30 43.80 000000010 | 8 0.49 44.28 000000011 | 2 0.12 44.40 000000100 | 28 1.70 46.11 000000101 | 1 0.06 46.17 000000110 | 7 0.43 46.59 000000111 | 1 0.06 46.65 000001000 | 26 1.58 48.24 000001001 | 5 0.30 48.54 000001010 | 3 0.18 48.72 000001011 | 1 0.06 48.78 000001100 | 11 0.67 49.45 000001101 | 1 0.06 49.51 000001110 | 6 0.36 49.88 000001111 | 5 0.30 50.18 000010000 | 14 0.85 51.03 000010010 | 1 0.06 51.09 000010100 | 2 0.12 51.22 000010101 | 1 0.06 51.28 000010110 | 1 0.06 51.34 000011000 | 4 0.24 51.58 000011001 | 1 0.06 51.64 000011011 | 1 0.06 51.70 000011100 | 4 0.24 51.95 000011101 | 1 0.06 52.01 000011110 | 1 0.06 52.07 000011111 | 6 0.36 52.43 000100000 | 2 0.12 52.55 000100010 | 1 0.06 52.62 000100101 | 1 0.06 52.68 000101001 | 1 0.06 52.74 000111000 | 1 0.06 52.80 000111111 | 1 0.06 52.86 001000000 | 1 0.06 52.92 001000001 | 1 0.06 52.98 001000010 | 3 0.18 53.16 001000011 | 1 0.06 53.22 001000100 | 1 0.06 53.28 001000110 | 1 0.06 53.35 001000111 | 1 0.06 53.41 001001000 | 1 0.06 53.47 001001001 | 1 0.06 53.53 001001100 | 2 0.12 53.65 001001101 | 1 0.06 53.71 001001110 | 1 0.06 53.77 001001111 | 2 0.12 53.89 001010111 | 1 0.06 53.95 001011000 | 1 0.06 54.01 001100111 | 1 0.06 54.08 001101100 | 1 0.06 54.14 010000000 | 86 5.23 59.37 010000001 | 4 0.24 59.61 010000010 | 5 0.30 59.91 010000011 | 5 0.30 60.22 010000100 | 26 1.58 61.80 010000101 | 1 0.06 61.86 010000110 | 5 0.30 62.17 010000111 | 4 0.24 62.41 010001000 | 26 1.58 63.99 010001001 | 2 0.12 64.11 010001010 | 3 0.18 64.29 010001011 | 6 0.36 64.66 
010001100 | 16 0.97 65.63 010001101 | 7 0.43 66.06 010001110 | 9 0.55 66.61 010001111 | 16 0.97 67.58 010010000 | 5 0.30 67.88 010010001 | 2 0.12 68.00 010010100 | 4 0.24 68.25 010010110 | 2 0.12 68.37 010010111 | 1 0.06 68.43 010011000 | 6 0.36 68.80 010011011 | 3 0.18 68.98 010011100 | 8 0.49 69.46 010011110 | 4 0.24 69.71 010011111 | 6 0.36 70.07 010100000 | 2 0.12 70.19 010100100 | 1 0.06 70.26 010100111 | 1 0.06 70.32 010101011 | 2 0.12 70.44 010101100 | 2 0.12 70.56 010101101 | 1 0.06 70.62 010101110 | 3 0.18 70.80 010101111 | 3 0.18 70.99 010110101 | 1 0.06 71.05 010110110 | 1 0.06 71.11 010111010 | 1 0.06 71.17 010111100 | 1 0.06 71.23 010111101 | 1 0.06 71.29 010111111 | 2 0.12 71.41 011000000 | 15 0.91 72.32 011000001 | 1 0.06 72.38 011000010 | 5 0.30 72.69 011000011 | 2 0.12 72.81 011000100 | 6 0.36 73.18 011000101 | 1 0.06 73.24 011000110 | 3 0.18 73.42 011000111 | 4 0.24 73.66 011001000 | 3 0.18 73.84 011001010 | 3 0.18 74.03 011001011 | 5 0.30 74.33 011001100 | 4 0.24 74.57 011001101 | 6 0.36 74.94 011001110 | 6 0.36 75.30 011001111 | 21 1.28 76.58 011010100 | 1 0.06 76.64 011011000 | 2 0.12 76.76 011011010 | 1 0.06 76.82 011011011 | 1 0.06 76.89 011011100 | 6 0.36 77.25 011011101 | 1 0.06 77.31 011011111 | 29 1.76 79.08 011100000 | 1 0.06 79.14 011100010 | 1 0.06 79.20 011100011 | 1 0.06 79.26 011100111 | 4 0.24 79.50 011101001 | 2 0.12 79.62 011101011 | 2 0.12 79.74 011101100 | 1 0.06 79.81 011101101 | 2 0.12 79.93 011101110 | 3 0.18 80.11 011101111 | 13 0.79 80.90 011110000 | 1 0.06 80.96 011110100 | 1 0.06 81.02 011110101 | 1 0.06 81.08 011111001 | 1 0.06 81.14 011111010 | 1 0.06 81.20 011111011 | 1 0.06 81.27 011111100 | 2 0.12 81.39 011111110 | 3 0.18 81.57 011111111 | 24 1.46 83.03 100000000 | 12 0.73 83.76 100000001 | 3 0.18 83.94 100000100 | 4 0.24 84.18 100000110 | 1 0.06 84.25 100001000 | 1 0.06 84.31 100001010 | 1 0.06 84.37 100001100 | 3 0.18 84.55 100001111 | 3 0.18 84.73 100010000 | 3 0.18 84.91 100011111 | 1 0.06 84.98 100101000 | 1 
0.06 85.04 100101101 | 2 0.12 85.16 100101111 | 2 0.12 85.28 100111010 | 1 0.06 85.34 100111100 | 2 0.12 85.46 100111110 | 1 0.06 85.52 100111111 | 1 0.06 85.58 101001000 | 1 0.06 85.64 101001100 | 1 0.06 85.71 101001110 | 1 0.06 85.77 101001111 | 1 0.06 85.83 101100011 | 1 0.06 85.89 101110110 | 1 0.06 85.95 101111111 | 1 0.06 86.01 110000000 | 3 0.18 86.19 110000100 | 3 0.18 86.37 110000101 | 1 0.06 86.44 110001011 | 1 0.06 86.50 110001100 | 2 0.12 86.62 110001101 | 1 0.06 86.68 110001110 | 1 0.06 86.74 110001111 | 8 0.49 87.23 110010011 | 1 0.06 87.29 110010100 | 1 0.06 87.35 110010111 | 1 0.06 87.41 110011101 | 1 0.06 87.47 110011111 | 4 0.24 87.71 110100111 | 1 0.06 87.77 110101100 | 1 0.06 87.83 110101111 | 1 0.06 87.90 110110000 | 1 0.06 87.96 110111011 | 2 0.12 88.08 110111100 | 1 0.06 88.14 110111110 | 2 0.12 88.26 110111111 | 5 0.30 88.56 111000000 | 5 0.30 88.87 111000100 | 2 0.12 88.99 111000110 | 2 0.12 89.11 111001010 | 1 0.06 89.17 111001100 | 1 0.06 89.23 111001111 | 10 0.61 89.84 111011111 | 10 0.61 90.45 111101101 | 1 0.06 90.51 111101110 | 2 0.12 90.63 111101111 | 13 0.79 91.42 111111100 | 3 0.18 91.61 111111101 | 1 0.06 91.67 111111110 | 2 0.12 91.79 111111111 | 135 8.21 100.00 ------------+----------------------------------- Total | 1,644 100.00 . . // #9 - additional material . // here is an alternative approach that only uses strings . . use wf-flims, clear (Workflow data on functional limitations \ 2008-04-02) . foreach v in hnd hvy lft rch sit std stp str wlk { 2. generate s`v' = string(`v',"%1.0f") 3. } . generate sflimall = shnd + shvy + slft + srch + ssit /// > + sstd + sstp + sstr + swlk . label var sflimall "hnd-hvy-lft-rch-sit-std-stp-str-wlk" . tabulate sflimall, missing hnd-hvy-lft | -rch-sit-st | d-stp-str-w | lk | Freq. Percent Cum. 
------------+----------------------------------- 000000000 | 715 43.49 43.49 000000001 | 5 0.30 43.80 000000010 | 8 0.49 44.28 000000011 | 2 0.12 44.40 000000100 | 28 1.70 46.11 000000101 | 1 0.06 46.17 000000110 | 7 0.43 46.59 000000111 | 1 0.06 46.65 000001000 | 26 1.58 48.24 000001001 | 5 0.30 48.54 000001010 | 3 0.18 48.72 000001011 | 1 0.06 48.78 000001100 | 11 0.67 49.45 000001101 | 1 0.06 49.51 000001110 | 6 0.36 49.88 000001111 | 5 0.30 50.18 000010000 | 14 0.85 51.03 000010010 | 1 0.06 51.09 000010100 | 2 0.12 51.22 000010101 | 1 0.06 51.28 000010110 | 1 0.06 51.34 000011000 | 4 0.24 51.58 000011001 | 1 0.06 51.64 000011011 | 1 0.06 51.70 000011100 | 4 0.24 51.95 000011101 | 1 0.06 52.01 000011110 | 1 0.06 52.07 000011111 | 6 0.36 52.43 000100000 | 2 0.12 52.55 000100010 | 1 0.06 52.62 000100101 | 1 0.06 52.68 000101001 | 1 0.06 52.74 000111000 | 1 0.06 52.80 000111111 | 1 0.06 52.86 001000000 | 1 0.06 52.92 001000001 | 1 0.06 52.98 001000010 | 3 0.18 53.16 001000011 | 1 0.06 53.22 001000100 | 1 0.06 53.28 001000110 | 1 0.06 53.35 001000111 | 1 0.06 53.41 001001000 | 1 0.06 53.47 001001001 | 1 0.06 53.53 001001100 | 2 0.12 53.65 001001101 | 1 0.06 53.71 001001110 | 1 0.06 53.77 001001111 | 2 0.12 53.89 001010111 | 1 0.06 53.95 001011000 | 1 0.06 54.01 001100111 | 1 0.06 54.08 001101100 | 1 0.06 54.14 010000000 | 86 5.23 59.37 010000001 | 4 0.24 59.61 010000010 | 5 0.30 59.91 010000011 | 5 0.30 60.22 010000100 | 26 1.58 61.80 010000101 | 1 0.06 61.86 010000110 | 5 0.30 62.17 010000111 | 4 0.24 62.41 010001000 | 26 1.58 63.99 010001001 | 2 0.12 64.11 010001010 | 3 0.18 64.29 010001011 | 6 0.36 64.66 010001100 | 16 0.97 65.63 010001101 | 7 0.43 66.06 010001110 | 9 0.55 66.61 010001111 | 16 0.97 67.58 010010000 | 5 0.30 67.88 010010001 | 2 0.12 68.00 010010100 | 4 0.24 68.25 010010110 | 2 0.12 68.37 010010111 | 1 0.06 68.43 010011000 | 6 0.36 68.80 010011011 | 3 0.18 68.98 010011100 | 8 0.49 69.46 010011110 | 4 0.24 69.71 010011111 | 6 0.36 70.07 010100000 | 2 
0.12 70.19 010100100 | 1 0.06 70.26 010100111 | 1 0.06 70.32 010101011 | 2 0.12 70.44 010101100 | 2 0.12 70.56 010101101 | 1 0.06 70.62 010101110 | 3 0.18 70.80 010101111 | 3 0.18 70.99 010110101 | 1 0.06 71.05 010110110 | 1 0.06 71.11 010111010 | 1 0.06 71.17 010111100 | 1 0.06 71.23 010111101 | 1 0.06 71.29 010111111 | 2 0.12 71.41 011000000 | 15 0.91 72.32 011000001 | 1 0.06 72.38 011000010 | 5 0.30 72.69 011000011 | 2 0.12 72.81 011000100 | 6 0.36 73.18 011000101 | 1 0.06 73.24 011000110 | 3 0.18 73.42 011000111 | 4 0.24 73.66 011001000 | 3 0.18 73.84 011001010 | 3 0.18 74.03 011001011 | 5 0.30 74.33 011001100 | 4 0.24 74.57 011001101 | 6 0.36 74.94 011001110 | 6 0.36 75.30 011001111 | 21 1.28 76.58 011010100 | 1 0.06 76.64 011011000 | 2 0.12 76.76 011011010 | 1 0.06 76.82 011011011 | 1 0.06 76.89 011011100 | 6 0.36 77.25 011011101 | 1 0.06 77.31 011011111 | 29 1.76 79.08 011100000 | 1 0.06 79.14 011100010 | 1 0.06 79.20 011100011 | 1 0.06 79.26 011100111 | 4 0.24 79.50 011101001 | 2 0.12 79.62 011101011 | 2 0.12 79.74 011101100 | 1 0.06 79.81 011101101 | 2 0.12 79.93 011101110 | 3 0.18 80.11 011101111 | 13 0.79 80.90 011110000 | 1 0.06 80.96 011110100 | 1 0.06 81.02 011110101 | 1 0.06 81.08 011111001 | 1 0.06 81.14 011111010 | 1 0.06 81.20 011111011 | 1 0.06 81.27 011111100 | 2 0.12 81.39 011111110 | 3 0.18 81.57 011111111 | 24 1.46 83.03 100000000 | 12 0.73 83.76 100000001 | 3 0.18 83.94 100000100 | 4 0.24 84.18 100000110 | 1 0.06 84.25 100001000 | 1 0.06 84.31 100001010 | 1 0.06 84.37 100001100 | 3 0.18 84.55 100001111 | 3 0.18 84.73 100010000 | 3 0.18 84.91 100011111 | 1 0.06 84.98 100101000 | 1 0.06 85.04 100101101 | 2 0.12 85.16 100101111 | 2 0.12 85.28 100111010 | 1 0.06 85.34 100111100 | 2 0.12 85.46 100111110 | 1 0.06 85.52 100111111 | 1 0.06 85.58 101001000 | 1 0.06 85.64 101001100 | 1 0.06 85.71 101001110 | 1 0.06 85.77 101001111 | 1 0.06 85.83 101100011 | 1 0.06 85.89 101110110 | 1 0.06 85.95 101111111 | 1 0.06 86.01 110000000 | 3 0.18 86.19 
110000100 | 3 0.18 86.37 110000101 | 1 0.06 86.44 110001011 | 1 0.06 86.50 110001100 | 2 0.12 86.62 110001101 | 1 0.06 86.68 110001110 | 1 0.06 86.74 110001111 | 8 0.49 87.23 110010011 | 1 0.06 87.29 110010100 | 1 0.06 87.35 110010111 | 1 0.06 87.41 110011101 | 1 0.06 87.47 110011111 | 4 0.24 87.71 110100111 | 1 0.06 87.77 110101100 | 1 0.06 87.83 110101111 | 1 0.06 87.90 110110000 | 1 0.06 87.96 110111011 | 2 0.12 88.08 110111100 | 1 0.06 88.14 110111110 | 2 0.12 88.26 110111111 | 5 0.30 88.56 111000000 | 5 0.30 88.87 111000100 | 2 0.12 88.99 111000110 | 2 0.12 89.11 111001010 | 1 0.06 89.17 111001100 | 1 0.06 89.23 111001111 | 10 0.61 89.84 111011111 | 10 0.61 90.45 111101101 | 1 0.06 90.51 111101110 | 2 0.12 90.63 111101111 | 13 0.79 91.42 111111100 | 3 0.18 91.61 111111101 | 1 0.06 91.67 111111110 | 2 0.12 91.79 111111111 | 135 8.21 100.00 ------------+----------------------------------- Total | 1,644 100.00 . . log close log: D:\wf\work\wf3-debug-precision.log log type: text closed on: 24 Oct 2008, 09:41:01 -------------------------------------------------------------------------------- . exit end of do-file . . log close master log: D:\wf\work\wf3.log log type: text closed on: 24 Oct 2008, 09:41:01 -------------------------------------------------------------------------------- . exit end of do-file . do wf4.do . capture log close master . log using wf4, name(master) replace text (note: file D:\wf\work\wf4.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf4.log log type: text opened on: 24 Oct 2008, 09:41:01 . . // program: wf4.do \ for stata 9 . // task: run all do-files in the order they appear . // project: workflow - chapter 4 . // author: scott long \ 2008-10-24 . . * macros . do wf4-macros.do, nostop . capture log close . 
log using wf4-macros, replace text (note: file D:\wf\work\wf4-macros.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf4-macros.log log type: text opened on: 24 Oct 2008, 09:41:01 . . // program: wf4-macros.do \ for stata 9 . // task: macro examples . // project: workflow chapter 4 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // examples of simple macros . . * defining and displaying local macros . local rhs "var1 var2 var3 var4" . display "The local macro rhs contains: `rhs'" The local macro rhs contains: var1 var2 var3 var4 . . local ncases = 198 . display "The local ncases equals: `ncases'" The local ncases equals: 198 . . . * defining and displaying global macros . global rhs "var1 var2 var3 var4" . display "The global macro rhs contains: $rhs" The global macro rhs contains: var1 var2 var3 var4 . . global ncases = 198 . display "The global ncases equals: $ncases" The global ncases equals: 198 . . * using double quotes . local myvars "y x1 x2" . display ">`myvars'<" >y x1 x2< . . local myvars y x1 x2 . display ">`myvars'<" >y x1 x2< . . // #2 . // entering long strings . . * create the macros in one step . local demogvars "female black hispanic age agesq edhighschl edcollege edpostgr > ad incdollars childsqrt" . display "Start of macro=>`demogvars'<=End of macro" Start of macro=>female black hispanic age agesq edhighschl edcollege edpostgrad > incdollars childsqrt<=End of macro . . * creating a long macro in steps . local demogvars "female black hispanic age agesq" . local demogvars "`demogvars' edhighschl edcollege edpostgrad" . local demogvars "`demogvars' incdollars childsqrt" . . // #3 . // using macros for a list of variables . . use wf-macros, clear (Workflow data for illustrating macros \ 2008-04-02) . . summarize lfp k5 k618 age wc hc lwg inc Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- lfp | 753 .5683931 .4956295 0 1 k5 | 753 .2377158 .523959 0 3 k618 | 753 1.353254 1.319874 0 8 age | 753 42.53785 8.072574 30 60 wc | 753 .2815405 .4500494 0 1 -------------+-------------------------------------------------------- hc | 753 .3917663 .4884694 0 1 lwg | 753 1.097115 .5875564 -2.054124 3.218876 inc | 753 20.12897 11.6348 -.0290001 96 . logit lfp k5 k618 age wc hc lwg inc Iteration 0: log likelihood = -514.8732 Iteration 1: log likelihood = -454.32339 Iteration 2: log likelihood = -452.64187 Iteration 3: log likelihood = -452.63296 Iteration 4: log likelihood = -452.63296 Logistic regression Number of obs = 753 LR chi2(7) = 124.48 Prob > chi2 = 0.0000 Log likelihood = -452.63296 Pseudo R2 = 0.1209 ------------------------------------------------------------------------------ lfp | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- k5 | -1.462913 .1970006 -7.43 0.000 -1.849027 -1.076799 k618 | -.0645707 .0680008 -0.95 0.342 -.1978499 .0687085 age | -.0628706 .0127831 -4.92 0.000 -.0879249 -.0378162 wc | .8072738 .2299799 3.51 0.000 .3565215 1.258026 hc | .1117336 .2060397 0.54 0.588 -.2920969 .515564 lwg | .6046931 .1508176 4.01 0.000 .3090961 .9002901 inc | -.0344464 .0082084 -4.20 0.000 -.0505346 -.0183583 _cons | 3.18214 .6443751 4.94 0.000 1.919188 4.445092 ------------------------------------------------------------------------------ . . summarize lfp k5 k618 age agesquared wc lwg inc Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- lfp | 753 .5683931 .4956295 0 1 k5 | 753 .2377158 .523959 0 3 k618 | 753 1.353254 1.319874 0 8 age | 753 42.53785 8.072574 30 60 agesquared | 753 1874.548 699.5167 900 3600 -------------+-------------------------------------------------------- wc | 753 .2815405 .4500494 0 1 lwg | 753 1.097115 .5875564 -2.054124 3.218876 inc | 753 20.12897 11.6348 -.0290001 96 . logit lfp k5 k618 age agesquared wc lwg inc Iteration 0: log likelihood = -514.8732 Iteration 1: log likelihood = -453.83503 Iteration 2: log likelihood = -452.25621 Iteration 3: log likelihood = -452.24834 Iteration 4: log likelihood = -452.24834 Logistic regression Number of obs = 753 LR chi2(7) = 125.25 Prob > chi2 = 0.0000 Log likelihood = -452.24834 Pseudo R2 = 0.1216 ------------------------------------------------------------------------------ lfp | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- k5 | -1.406608 .1996585 -7.05 0.000 -1.797932 -1.015285 k618 | -.0802251 .0695772 -1.15 0.249 -.2165939 .0561437 age | .0573735 .1180462 0.49 0.627 -.1739928 .2887399 agesquared | -.0013906 .0013508 -1.03 0.303 -.0040382 .0012569 wc | .8748639 .2071886 4.22 0.000 .4687818 1.280946 lwg | .592987 .1505529 3.94 0.000 .2979088 .8880653 inc | -.034011 .0079067 -4.30 0.000 -.0495078 -.0185141 _cons | .7103612 2.50828 0.28 0.777 -4.205777 5.6265 ------------------------------------------------------------------------------ . . local myvars "lfp k5 k618 age wc hc lwg inc" . summarize `myvars' Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- lfp | 753 .5683931 .4956295 0 1 k5 | 753 .2377158 .523959 0 3 k618 | 753 1.353254 1.319874 0 8 age | 753 42.53785 8.072574 30 60 wc | 753 .2815405 .4500494 0 1 -------------+-------------------------------------------------------- hc | 753 .3917663 .4884694 0 1 lwg | 753 1.097115 .5875564 -2.054124 3.218876 inc | 753 20.12897 11.6348 -.0290001 96 . logit `myvars' Iteration 0: log likelihood = -514.8732 Iteration 1: log likelihood = -454.32339 Iteration 2: log likelihood = -452.64187 Iteration 3: log likelihood = -452.63296 Iteration 4: log likelihood = -452.63296 Logistic regression Number of obs = 753 LR chi2(7) = 124.48 Prob > chi2 = 0.0000 Log likelihood = -452.63296 Pseudo R2 = 0.1209 ------------------------------------------------------------------------------ lfp | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- k5 | -1.462913 .1970006 -7.43 0.000 -1.849027 -1.076799 k618 | -.0645707 .0680008 -0.95 0.342 -.1978499 .0687085 age | -.0628706 .0127831 -4.92 0.000 -.0879249 -.0378162 wc | .8072738 .2299799 3.51 0.000 .3565215 1.258026 hc | .1117336 .2060397 0.54 0.588 -.2920969 .515564 lwg | .6046931 .1508176 4.01 0.000 .3090961 .9002901 inc | -.0344464 .0082084 -4.20 0.000 -.0505346 -.0183583 _cons | 3.18214 .6443751 4.94 0.000 1.919188 4.445092 ------------------------------------------------------------------------------ . . local myvars "lfp k5 k618 age agesquared wc lwg inc" . summarize `myvars' Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- lfp | 753 .5683931 .4956295 0 1 k5 | 753 .2377158 .523959 0 3 k618 | 753 1.353254 1.319874 0 8 age | 753 42.53785 8.072574 30 60 agesquared | 753 1874.548 699.5167 900 3600 -------------+-------------------------------------------------------- wc | 753 .2815405 .4500494 0 1 lwg | 753 1.097115 .5875564 -2.054124 3.218876 inc | 753 20.12897 11.6348 -.0290001 96 . logit `myvars' Iteration 0: log likelihood = -514.8732 Iteration 1: log likelihood = -453.83503 Iteration 2: log likelihood = -452.25621 Iteration 3: log likelihood = -452.24834 Iteration 4: log likelihood = -452.24834 Logistic regression Number of obs = 753 LR chi2(7) = 125.25 Prob > chi2 = 0.0000 Log likelihood = -452.24834 Pseudo R2 = 0.1216 ------------------------------------------------------------------------------ lfp | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- k5 | -1.406608 .1996585 -7.05 0.000 -1.797932 -1.015285 k618 | -.0802251 .0695772 -1.15 0.249 -.2165939 .0561437 age | .0573735 .1180462 0.49 0.627 -.1739928 .2887399 agesquared | -.0013906 .0013508 -1.03 0.303 -.0040382 .0012569 wc | .8748639 .2071886 4.22 0.000 .4687818 1.280946 lwg | .592987 .1505529 3.94 0.000 .2979088 .8880653 inc | -.034011 .0079067 -4.30 0.000 -.0495078 -.0185141 _cons | .7103612 2.50828 0.28 0.777 -4.205777 5.6265 ------------------------------------------------------------------------------ . . // #3 . // using macros to specify nested models . . * create locals with sets of variables . local set1_age "age agesquared" . local set2_educ "wc hc" . local set3_kids "k5 k618" . local set4_money "lwg inc" . . display " set1_age: `set1_age'" set1_age: age agesquared . display " set2_educ: `set2_educ'" set2_educ: wc hc . display " set3_kids: `set3_kids'" set3_kids: k5 k618 . display "set4_money: `set4_money'" set4_money: lwg inc . . * specify nested models . 
local model_1 "`set1_age'" . local model_2 "`model_1' `set2_educ'" . local model_3 "`model_2' `set3_kids'" . local model_4 "`model_3' `set4_money'" . . * check the model specifications . display "model_1: `model_1'" model_1: age agesquared . display "model_2: `model_2'" model_2: age agesquared wc hc . display "model_3: `model_3'" model_3: age agesquared wc hc k5 k618 . display "model_4: `model_4'" model_4: age agesquared wc hc k5 k618 lwg inc . . * run the nested models . logit lfp `model_1' Iteration 0: log likelihood = -514.8732 Iteration 1: log likelihood = -509.883 Iteration 2: log likelihood = -509.88268 Logistic regression Number of obs = 753 LR chi2(2) = 9.98 Prob > chi2 = 0.0068 Log likelihood = -509.88268 Pseudo R2 = 0.0097 ------------------------------------------------------------------------------ lfp | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- age | .2063445 .1011173 2.04 0.041 .0081582 .4045309 agesquared | -.0026239 .001168 -2.25 0.025 -.0049131 -.0003347 _cons | -3.581785 2.131194 -1.68 0.093 -7.758849 .5952794 ------------------------------------------------------------------------------ . logit lfp `model_2' Iteration 0: log likelihood = -514.8732 Iteration 1: log likelihood = -502.00571 Iteration 2: log likelihood = -501.97961 Iteration 3: log likelihood = -501.97961 Logistic regression Number of obs = 753 LR chi2(4) = 25.79 Prob > chi2 = 0.0000 Log likelihood = -501.97961 Pseudo R2 = 0.0250 ------------------------------------------------------------------------------ lfp | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- age | .2221776 .1026553 2.16 0.030 .0209769 .4233783 agesquared | -.0027994 .0011843 -2.36 0.018 -.0051207 -.0004781 wc | .7499589 .2021661 3.71 0.000 .3537207 1.146197 hc | -.166444 .1834571 -0.91 0.364 -.5260133 .1931254 _cons | -4.063946 2.173895 -1.87 0.062 -8.324702 .1968106 ------------------------------------------------------------------------------ . logit lfp `model_3' Iteration 0: log likelihood = -514.8732 Iteration 1: log likelihood = -470.13082 Iteration 2: log likelihood = -469.51354 Iteration 3: log likelihood = -469.51238 Iteration 4: log likelihood = -469.51238 Logistic regression Number of obs = 753 LR chi2(6) = 90.72 Prob > chi2 = 0.0000 Log likelihood = -469.51238 Pseudo R2 = 0.0881 ------------------------------------------------------------------------------ lfp | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- age | .0296018 .114766 0.26 0.796 -.1953355 .2545391 agesquared | -.0011262 .0013128 -0.86 0.391 -.0036993 .0014468 wc | .8794929 .2158572 4.07 0.000 .4564206 1.302565 hc | -.1097373 .1928231 -0.57 0.569 -.4876635 .268189 k5 | -1.409346 .1961559 -7.18 0.000 -1.793804 -1.024887 k618 | -.1210794 .0678917 -1.78 0.075 -.2541447 .0119859 _cons | 1.446109 2.462198 0.59 0.557 -3.379711 6.271929 ------------------------------------------------------------------------------ . logit lfp `model_4' Iteration 0: log likelihood = -514.8732 Iteration 1: log likelihood = -453.66035 Iteration 2: log likelihood = -452.04667 Iteration 3: log likelihood = -452.03836 Iteration 4: log likelihood = -452.03836 Logistic regression Number of obs = 753 LR chi2(8) = 125.67 Prob > chi2 = 0.0000 Log likelihood = -452.03836 Pseudo R2 = 0.1220 ------------------------------------------------------------------------------ lfp | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- age | .0659135 .1188199 0.55 0.579 -.1669693 .2987962 agesquared | -.0014784 .0013584 -1.09 0.276 -.0041408 .001184 wc | .8098626 .2299065 3.52 0.000 .3592542 1.260471 hc | .1340998 .207023 0.65 0.517 -.2716579 .5398575 k5 | -1.411597 .2001829 -7.05 0.000 -1.803948 -1.019246 k618 | -.0815087 .0696247 -1.17 0.242 -.2179706 .0549531 lwg | .5925741 .1507807 3.93 0.000 .2970495 .8880988 inc | -.0355964 .0083188 -4.28 0.000 -.0519009 -.0192919 _cons | .511489 2.527194 0.20 0.840 -4.44172 5.464698 ------------------------------------------------------------------------------ . . log close log: D:\wf\work\wf4-macros.log log type: text closed on: 24 Oct 2008, 09:41:01 -------------------------------------------------------------------------------- . exit end of do-file . do wf4-macros-graph.do, nostop . capture log close . log using wf4-macros-graph, replace text (note: file D:\wf\work\wf4-macros-graph.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf4-macros-graph.log log type: text opened on: 24 Oct 2008, 09:41:01 . . // program: wf4-macros-graph.do \ for stata 9 . // task: macros for setting graph options . // project: workflow chapter 4 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . set scheme s2manual . . // #1 . // load data . . use wf-macros, clear (Workflow data for illustrating macros \ 2008-04-02) . . // #2 . // complicated graph without using macros . . 
graph twoway /// > (connected pr_women articles, lpattern(solid) lwidth(medthick) /// > lcolor(black) msymbol(i)) /// > (connected pr_men articles, lpattern(dash) lwidth(medthick) /// > lcolor(black) msymbol(i)) /// > , ylabel(0(.2)1., grid glwidth(medium) glpattern(dash)) xlabel(0(10)50) /// > ytitle("Probability of tenure") /// > legend(pos(11) order(2 1) ring(0) cols(1)) . graph export wf4-macros-graph.eps, replace (note: file wf4-macros-graph.eps not found) (file wf4-macros-graph.eps written in EPS format) . . // #3 . // options defined in locals . . * line characteristics . local opt_linF "lpattern(solid) lwidth(medthick) lcolor(black) msymbol(i)" . local opt_linM "lpattern(dash) lwidth(medthick) lcolor(black) msymbol(i)" . * grid options . local opt_ygrid "grid glwidth(medium) glpattern(dash)" . * legend options . local opt_legend "pos(11) order(2 1) ring(0) cols(1)" . . graph twoway /// > (connected pr_women articles, `opt_linF') /// > (connected pr_men articles, `opt_linM') /// > , xlabel(0(10)50) ylabel(0(.2)1., `opt_ygrid') /// > ytitle("Probability of tenure") /// > legend(`opt_legend') . . // #4 . // change to colored lines . . * line characteristics . local opt_linF "lpattern(solid) lwidth(medthick) lcolor(red) msymbol(i)" . local opt_linM "lpattern(dash) lwidth(medthick) lcolor(blue) msymbol(i)" . . graph twoway /// > (connected pr_women articles, `opt_linF') /// > (connected pr_men articles, `opt_linM') /// > , xlabel(0(10)50) ylabel(0(.2)1., `opt_ygrid') /// > ytitle("Probability of tenure") /// > legend(`opt_legend') . local opt_linF "clpat(solid) clwidth(medthick) clcolor(blue)" . local opt_linM "clpat(solid) clwidth(medthick) clcolor(red) " . . log close log: D:\wf\work\wf4-macros-graph.log log type: text closed on: 24 Oct 2008, 09:41:04 -------------------------------------------------------------------------------- . exit end of do-file . . * returned results . do wf4-returned.do, nostop . capture log close . 
log using wf4-returned, replace text
(note: file D:\wf\work\wf4-returned.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf4-returned.log
  log type:  text
 opened on:  24 Oct 2008, 09:41:04
.
. // program: wf4-returned.do \ for stata 9
. // task: using returned results
. // project: workflow chapter 4
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // centering age by hand
.
. use wf-lfp, clear
(Workflow data on labor force participation \ 2008-04-02)
. summarize age

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         age |       753    42.53785     8.072574        30         60

. generate age_mean = age - 42.53785
. label var age_mean "age - mean(age)"
. summarize age_mean

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    age_mean |       753   -1.49e-06     8.072574 -12.53785   17.46215

.
. // #2
. // centering with returned results
.
. summarize age

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         age |       753    42.53785     8.072574        30         60

. return list

scalars:
                  r(N) =  753
              r(sum_w) =  753
               r(mean) =  42.53784860557769
                r(Var) =  65.16645121641095
                 r(sd) =  8.072574014303674
                r(min) =  30
                r(max) =  60
                r(sum) =  32031

. generate age_meanV2 = age - r(mean)
. label var age_meanV2 "age - mean(age)"
. summarize age_mean age_meanV2

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    age_mean |       753   -1.49e-06     8.072574 -12.53785   17.46215
  age_meanV2 |       753    6.29e-08     8.072574 -12.53785   17.46215

.
. summarize age

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         age |       753    42.53785     8.072574        30         60

. generate double age_meanV3 = age - r(mean)
. label var age_meanV3 "age - mean(age) using double precision"
. summarize age_mean age_meanV2 age_meanV3

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    age_mean |       753   -1.49e-06     8.072574 -12.53785   17.46215
  age_meanV2 |       753    6.29e-08     8.072574 -12.53785   17.46215
  age_meanV3 |       753    3.14e-15     8.072574 -12.53785   17.46215

.
. // #3
. // adding returns to a local
.
. summarize age

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         age |       753    42.53785     8.072574        30         60

. local mean_age = r(mean)
. local sd_age = r(sd)
. display "The mean of age `mean_age' (sd=`sd_age')."
The mean of age 42.53784860557769 (sd=8.072574014303674).
.
. * rounding the result
. local mean_agefmt = string(r(mean),"%8.3f")
. local sd_agefmt = string(r(sd),"%8.3f")
. display "The mean of age `mean_agefmt' (sd=`sd_agefmt')."
The mean of age 42.538 (sd=8.073).
.
. * if you don't want to use locals, you can do this
. display "The mean of age " %8.3f as result r(mean) ///
> " (sd=" %8.3f as result r(sd) ")"
The mean of age   42.538 (sd=   8.073)
.
. log close
       log:  D:\wf\work\wf4-returned.log
  log type:  text
 closed on:  24 Oct 2008, 09:41:04
--------------------------------------------------------------------------------
. exit

end of do-file
.
. * loops
. do wf4-loops.do, nostop
. capture log close
. log using wf4-loops, replace text
(note: file D:\wf\work\wf4-loops.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf4-loops.log
  log type:  text
 opened on:  24 Oct 2008, 09:41:04
.
. // program: wf4-loops.do \ for stata 9
. // task: examples of loops
. // project: workflow chapter 4
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. set trace off
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // create binary variables using a loop
.
.
use wf-loops, clear
(Workflow data for illustrating loops \ 2008-04-02)
.
. * without loop
. generate y_lt2 = y<2 if !missing(y)
. generate y_lt3 = y<3 if !missing(y)
. generate y_lt4 = y<4 if !missing(y)
.
. * drop these so I can create them another way
. drop y_lt2 y_lt3 y_lt4
.
. * with foreach loop
. foreach cutpt in 2 3 4 {
  2. generate y_lt`cutpt' = y<`cutpt' if !missing(y)
  3. }
.
. * drop these so I can create them another way
. drop y_lt2 y_lt3 y_lt4
.
. * with forvalues loop
. forvalues cutpt = 2(1)4 {
  2. generate y_lt`cutpt' = y<`cutpt' if !missing(y)
  3. }
.
. // #2
. // estimate models using a loop
.
. local rhs "yr89 male white age ed prst"
.
. * without a loop
. logit y_lt2 `rhs'

Iteration 0:   log likelihood = -883.91038
Iteration 1:   log likelihood = -824.35787
Iteration 2:   log likelihood =  -819.6587
Iteration 3:   log likelihood = -819.61993
Iteration 4:   log likelihood = -819.61992

Logistic regression                               Number of obs   =       2293
                                                  LR chi2(6)      =     128.58
                                                  Prob > chi2     =     0.0000
Log likelihood = -819.61992                       Pseudo R2       =     0.0727

------------------------------------------------------------------------------
       y_lt2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        yr89 |  -.9647422   .1542064    -6.26   0.000    -1.266981   -.6625033
        male |   .3053643   .1291546     2.36   0.018      .052226    .5585025
       white |   .5526576   .2305396     2.40   0.017     .1008082    1.004507
         age |   .0164704   .0040571     4.06   0.000     .0085187    .0244221
          ed |  -.1047962   .0253348    -4.14   0.000    -.1544516   -.0551409
        prst |   .0014112   .0056702     0.25   0.803    -.0097023    .0125246
       _cons |  -1.858405   .3958164    -4.70   0.000     -2.63419   -1.082619
------------------------------------------------------------------------------

.
logit y_lt3 `rhs'

Iteration 0:   log likelihood = -1575.4005
Iteration 1:   log likelihood = -1450.7598
Iteration 2:   log likelihood = -1449.7869
Iteration 3:   log likelihood = -1449.7863

Logistic regression                               Number of obs   =       2293
                                                  LR chi2(6)      =     251.23
                                                  Prob > chi2     =     0.0000
Log likelihood = -1449.7863                       Pseudo R2       =     0.0797

------------------------------------------------------------------------------
       y_lt3 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        yr89 |  -.5654063   .0928433    -6.09   0.000    -.7473757   -.3834368
        male |   .6905423   .0898786     7.68   0.000     .5143834    .8667012
       white |   .3142708   .1405978     2.24   0.025     .0387042    .5898374
         age |   .0253345   .0028644     8.84   0.000     .0197203    .0309486
          ed |  -.0528527   .0184571    -2.86   0.004    -.0890279   -.0166774
        prst |  -.0095322   .0038184    -2.50   0.013    -.0170162   -.0020482
       _cons |  -.7303287    .269163    -2.71   0.007    -1.257878    -.202779
------------------------------------------------------------------------------

. logit y_lt4 `rhs'

Iteration 0:   log likelihood = -1499.7318
Iteration 1:   log likelihood = -1429.2806
Iteration 2:   log likelihood = -1428.6455
Iteration 3:   log likelihood = -1428.6452

Logistic regression                               Number of obs   =       2293
                                                  LR chi2(6)      =     142.17
                                                  Prob > chi2     =     0.0000
Log likelihood = -1428.6452                       Pseudo R2       =     0.0474

------------------------------------------------------------------------------
       y_lt4 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        yr89 |  -.3440164   .0920766    -3.74   0.000    -.5244832   -.1635495
        male |   .6559833   .0914586     7.17   0.000     .4767276    .8352389
       white |   .2260828   .1352834     1.67   0.095    -.0390678    .4912334
         age |   .0181817   .0029363     6.19   0.000     .0124266    .0239367
          ed |  -.0415125   .0192516    -2.16   0.031    -.0792449     -.00378
        prst |   -.006295   .0038444    -1.64   0.102    -.0138299    .0012399
       _cons |   .1849854   .2735326     0.68   0.499    -.3511286    .7210993
------------------------------------------------------------------------------

.
.
* with foreach loop . foreach lhs in y_lt2 y_lt3 y_lt4 { 2. logit `lhs' `rhs' 3. } Iteration 0: log likelihood = -883.91038 Iteration 1: log likelihood = -824.35787 Iteration 2: log likelihood = -819.6587 Iteration 3: log likelihood = -819.61993 Iteration 4: log likelihood = -819.61992 Logistic regression Number of obs = 2293 LR chi2(6) = 128.58 Prob > chi2 = 0.0000 Log likelihood = -819.61992 Pseudo R2 = 0.0727 ------------------------------------------------------------------------------ y_lt2 | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- yr89 | -.9647422 .1542064 -6.26 0.000 -1.266981 -.6625033 male | .3053643 .1291546 2.36 0.018 .052226 .5585025 white | .5526576 .2305396 2.40 0.017 .1008082 1.004507 age | .0164704 .0040571 4.06 0.000 .0085187 .0244221 ed | -.1047962 .0253348 -4.14 0.000 -.1544516 -.0551409 prst | .0014112 .0056702 0.25 0.803 -.0097023 .0125246 _cons | -1.858405 .3958164 -4.70 0.000 -2.63419 -1.082619 ------------------------------------------------------------------------------ Iteration 0: log likelihood = -1575.4005 Iteration 1: log likelihood = -1450.7598 Iteration 2: log likelihood = -1449.7869 Iteration 3: log likelihood = -1449.7863 Logistic regression Number of obs = 2293 LR chi2(6) = 251.23 Prob > chi2 = 0.0000 Log likelihood = -1449.7863 Pseudo R2 = 0.0797 ------------------------------------------------------------------------------ y_lt3 | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- yr89 | -.5654063 .0928433 -6.09 0.000 -.7473757 -.3834368 male | .6905423 .0898786 7.68 0.000 .5143834 .8667012 white | .3142708 .1405978 2.24 0.025 .0387042 .5898374 age | .0253345 .0028644 8.84 0.000 .0197203 .0309486 ed | -.0528527 .0184571 -2.86 0.004 -.0890279 -.0166774 prst | -.0095322 .0038184 -2.50 0.013 -.0170162 -.0020482 _cons | -.7303287 .269163 -2.71 0.007 -1.257878 -.202779 ------------------------------------------------------------------------------ Iteration 0: log likelihood = -1499.7318 Iteration 1: log likelihood = -1429.2806 Iteration 2: log likelihood = -1428.6455 Iteration 3: log likelihood = -1428.6452 Logistic regression Number of obs = 2293 LR chi2(6) = 142.17 Prob > chi2 = 0.0000 Log likelihood = -1428.6452 Pseudo R2 = 0.0474 ------------------------------------------------------------------------------ y_lt4 | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- yr89 | -.3440164 .0920766 -3.74 0.000 -.5244832 -.1635495 male | .6559833 .0914586 7.17 0.000 .4767276 .8352389 white | .2260828 .1352834 1.67 0.095 -.0390678 .4912334 age | .0181817 .0029363 6.19 0.000 .0124266 .0239367 ed | -.0415125 .0192516 -2.16 0.031 -.0792449 -.00378 prst | -.006295 .0038444 -1.64 0.102 -.0138299 .0012399 _cons | .1849854 .2735326 0.68 0.499 -.3511286 .7210993 ------------------------------------------------------------------------------ . . * with additional commands . foreach lhs in y_lt2 y_lt3 y_lt4 { 2. tabulate `lhs' 3. logit `lhs' `rhs' 4. probit `lhs' `rhs' 5. } y_lt2 | Freq. Percent Cum. 
------------+----------------------------------- 0 | 1,996 87.05 87.05 1 | 297 12.95 100.00 ------------+----------------------------------- Total | 2,293 100.00 Iteration 0: log likelihood = -883.91038 Iteration 1: log likelihood = -824.35787 Iteration 2: log likelihood = -819.6587 Iteration 3: log likelihood = -819.61993 Iteration 4: log likelihood = -819.61992 Logistic regression Number of obs = 2293 LR chi2(6) = 128.58 Prob > chi2 = 0.0000 Log likelihood = -819.61992 Pseudo R2 = 0.0727 ------------------------------------------------------------------------------ y_lt2 | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- yr89 | -.9647422 .1542064 -6.26 0.000 -1.266981 -.6625033 male | .3053643 .1291546 2.36 0.018 .052226 .5585025 white | .5526576 .2305396 2.40 0.017 .1008082 1.004507 age | .0164704 .0040571 4.06 0.000 .0085187 .0244221 ed | -.1047962 .0253348 -4.14 0.000 -.1544516 -.0551409 prst | .0014112 .0056702 0.25 0.803 -.0097023 .0125246 _cons | -1.858405 .3958164 -4.70 0.000 -2.63419 -1.082619 ------------------------------------------------------------------------------ Iteration 0: log likelihood = -883.91038 Iteration 1: log likelihood = -820.0978 Iteration 2: log likelihood = -818.71572 Iteration 3: log likelihood = -818.71278 Probit regression Number of obs = 2293 LR chi2(6) = 130.40 Prob > chi2 = 0.0000 Log likelihood = -818.71278 Pseudo R2 = 0.0738 ------------------------------------------------------------------------------ y_lt2 | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- yr89 | -.5100307 .0782561 -6.52 0.000 -.6634099 -.3566516 male | .1693689 .0694431 2.44 0.015 .033263 .3054749 white | .2819498 .1183632 2.38 0.017 .0499623 .5139374 age | .0092023 .002179 4.22 0.000 .0049316 .013473 ed | -.0578632 .0137542 -4.21 0.000 -.084821 -.0309055 prst | .0006579 .0029821 0.22 0.825 -.0051868 .0065026 _cons | -1.073646 .2100896 -5.11 0.000 -1.485414 -.6618777 ------------------------------------------------------------------------------ y_lt3 | Freq. Percent Cum. ------------+----------------------------------- 0 | 1,273 55.52 55.52 1 | 1,020 44.48 100.00 ------------+----------------------------------- Total | 2,293 100.00 Iteration 0: log likelihood = -1575.4005 Iteration 1: log likelihood = -1450.7598 Iteration 2: log likelihood = -1449.7869 Iteration 3: log likelihood = -1449.7863 Logistic regression Number of obs = 2293 LR chi2(6) = 251.23 Prob > chi2 = 0.0000 Log likelihood = -1449.7863 Pseudo R2 = 0.0797 ------------------------------------------------------------------------------ y_lt3 | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- yr89 | -.5654063 .0928433 -6.09 0.000 -.7473757 -.3834368 male | .6905423 .0898786 7.68 0.000 .5143834 .8667012 white | .3142708 .1405978 2.24 0.025 .0387042 .5898374 age | .0253345 .0028644 8.84 0.000 .0197203 .0309486 ed | -.0528527 .0184571 -2.86 0.004 -.0890279 -.0166774 prst | -.0095322 .0038184 -2.50 0.013 -.0170162 -.0020482 _cons | -.7303287 .269163 -2.71 0.007 -1.257878 -.202779 ------------------------------------------------------------------------------ Iteration 0: log likelihood = -1575.4005 Iteration 1: log likelihood = -1450.5416 Iteration 2: log likelihood = -1449.9255 Iteration 3: log likelihood = -1449.9255 Probit regression Number of obs = 2293 LR chi2(6) = 250.95 Prob > chi2 = 0.0000 Log likelihood = -1449.9255 Pseudo R2 = 0.0796 ------------------------------------------------------------------------------ y_lt3 | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- yr89 | -.3431124 .0566074 -6.06 0.000 -.4540608 -.2321639 male | .4245153 .0548217 7.74 0.000 .3170668 .5319638 white | .190951 .0850404 2.25 0.025 .0242748 .3576272 age | .015555 .0017433 8.92 0.000 .0121383 .0189718 ed | -.0319397 .0112858 -2.83 0.005 -.0540595 -.0098199 prst | -.0059387 .0023393 -2.54 0.011 -.0105238 -.0013537 _cons | -.4517883 .1644594 -2.75 0.006 -.7741228 -.1294538 ------------------------------------------------------------------------------ y_lt4 | Freq. Percent Cum. 
------------+----------------------------------- 0 | 828 36.11 36.11 1 | 1,465 63.89 100.00 ------------+----------------------------------- Total | 2,293 100.00 Iteration 0: log likelihood = -1499.7318 Iteration 1: log likelihood = -1429.2806 Iteration 2: log likelihood = -1428.6455 Iteration 3: log likelihood = -1428.6452 Logistic regression Number of obs = 2293 LR chi2(6) = 142.17 Prob > chi2 = 0.0000 Log likelihood = -1428.6452 Pseudo R2 = 0.0474 ------------------------------------------------------------------------------ y_lt4 | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- yr89 | -.3440164 .0920766 -3.74 0.000 -.5244832 -.1635495 male | .6559833 .0914586 7.17 0.000 .4767276 .8352389 white | .2260828 .1352834 1.67 0.095 -.0390678 .4912334 age | .0181817 .0029363 6.19 0.000 .0124266 .0239367 ed | -.0415125 .0192516 -2.16 0.031 -.0792449 -.00378 prst | -.006295 .0038444 -1.64 0.102 -.0138299 .0012399 _cons | .1849854 .2735326 0.68 0.499 -.3511286 .7210993 ------------------------------------------------------------------------------ Iteration 0: log likelihood = -1499.7318 Iteration 1: log likelihood = -1429.0649 Iteration 2: log likelihood = -1428.8369 Iteration 3: log likelihood = -1428.8369 Probit regression Number of obs = 2293 LR chi2(6) = 141.79 Prob > chi2 = 0.0000 Log likelihood = -1428.8369 Pseudo R2 = 0.0473 ------------------------------------------------------------------------------ y_lt4 | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- yr89 | -.2120439 .0563051 -3.77 0.000 -.3223999 -.101688 male | .3980483 .0553706 7.19 0.000 .2895238 .5065728 white | .1389495 .0830906 1.67 0.094 -.023905 .301804 age | .0110942 .0017761 6.25 0.000 .0076131 .0145753 ed | -.0235668 .0115593 -2.04 0.041 -.0462227 -.000911 prst | -.0040603 .002349 -1.73 0.084 -.0086643 .0005438 _cons | .1028786 .1660516 0.62 0.536 -.2225765 .4283337 ------------------------------------------------------------------------------ . . // #3 - not shown in text . // expanding the loop to add labels . . use wf-loops, clear (Workflow data for illustrating loops \ 2008-04-02) . . foreach cutpt in 2 3 4 { 2. . * create binary outcome . generate y_lt`cutpt' = y<`cutpt' if !missing(y) 3. * add labels . label var y_lt`cutpt' "y is less than `cutpt'?" 4. label define y_lt`cutpt' 0 "Not<`cutpt'" 1 "Is<`cutpt'" 5. label val y_lt`cutpt' y_lt`cutpt' 6. * tabulate outcome . tabulate y_lt`cutpt' y 7. * estimate models . logit y_lt`cutpt' `rhs' 8. probit y_lt`cutpt' `rhs' 9. . } y is less | Artifical 6-category variable than 2? | 1_SD 2_D 3_D 4_SA 5 | Total -----------+-------------------------------------------------------+---------- Not<2 | 0 723 445 411 219 | 1,996 Is<2 | 297 0 0 0 0 | 297 -----------+-------------------------------------------------------+---------- Total | 297 723 445 411 219 | 2,293 | Artifical | 6-category y is less | variable than 2? 
| 6 | Total -----------+-----------+---------- Not<2 | 198 | 1,996 Is<2 | 0 | 297 -----------+-----------+---------- Total | 198 | 2,293 Iteration 0: log likelihood = -883.91038 Iteration 1: log likelihood = -824.35787 Iteration 2: log likelihood = -819.6587 Iteration 3: log likelihood = -819.61993 Iteration 4: log likelihood = -819.61992 Logistic regression Number of obs = 2293 LR chi2(6) = 128.58 Prob > chi2 = 0.0000 Log likelihood = -819.61992 Pseudo R2 = 0.0727 ------------------------------------------------------------------------------ y_lt2 | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- yr89 | -.9647422 .1542064 -6.26 0.000 -1.266981 -.6625033 male | .3053643 .1291546 2.36 0.018 .052226 .5585025 white | .5526576 .2305396 2.40 0.017 .1008082 1.004507 age | .0164704 .0040571 4.06 0.000 .0085187 .0244221 ed | -.1047962 .0253348 -4.14 0.000 -.1544516 -.0551409 prst | .0014112 .0056702 0.25 0.803 -.0097023 .0125246 _cons | -1.858405 .3958164 -4.70 0.000 -2.63419 -1.082619 ------------------------------------------------------------------------------ Iteration 0: log likelihood = -883.91038 Iteration 1: log likelihood = -820.0978 Iteration 2: log likelihood = -818.71572 Iteration 3: log likelihood = -818.71278 Probit regression Number of obs = 2293 LR chi2(6) = 130.40 Prob > chi2 = 0.0000 Log likelihood = -818.71278 Pseudo R2 = 0.0738 ------------------------------------------------------------------------------ y_lt2 | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- yr89 | -.5100307 .0782561 -6.52 0.000 -.6634099 -.3566516 male | .1693689 .0694431 2.44 0.015 .033263 .3054749 white | .2819498 .1183632 2.38 0.017 .0499623 .5139374 age | .0092023 .002179 4.22 0.000 .0049316 .013473 ed | -.0578632 .0137542 -4.21 0.000 -.084821 -.0309055 prst | .0006579 .0029821 0.22 0.825 -.0051868 .0065026 _cons | -1.073646 .2100896 -5.11 0.000 -1.485414 -.6618777 ------------------------------------------------------------------------------ y is less | Artifical 6-category variable than 3? | 1_SD 2_D 3_D 4_SA 5 | Total -----------+-------------------------------------------------------+---------- Not<3 | 0 0 445 411 219 | 1,273 Is<3 | 297 723 0 0 0 | 1,020 -----------+-------------------------------------------------------+---------- Total | 297 723 445 411 219 | 2,293 | Artifical | 6-category y is less | variable than 3? | 6 | Total -----------+-----------+---------- Not<3 | 198 | 1,273 Is<3 | 0 | 1,020 -----------+-----------+---------- Total | 198 | 2,293 Iteration 0: log likelihood = -1575.4005 Iteration 1: log likelihood = -1450.7598 Iteration 2: log likelihood = -1449.7869 Iteration 3: log likelihood = -1449.7863 Logistic regression Number of obs = 2293 LR chi2(6) = 251.23 Prob > chi2 = 0.0000 Log likelihood = -1449.7863 Pseudo R2 = 0.0797 ------------------------------------------------------------------------------ y_lt3 | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- yr89 | -.5654063 .0928433 -6.09 0.000 -.7473757 -.3834368 male | .6905423 .0898786 7.68 0.000 .5143834 .8667012 white | .3142708 .1405978 2.24 0.025 .0387042 .5898374 age | .0253345 .0028644 8.84 0.000 .0197203 .0309486 ed | -.0528527 .0184571 -2.86 0.004 -.0890279 -.0166774 prst | -.0095322 .0038184 -2.50 0.013 -.0170162 -.0020482 _cons | -.7303287 .269163 -2.71 0.007 -1.257878 -.202779 ------------------------------------------------------------------------------ Iteration 0: log likelihood = -1575.4005 Iteration 1: log likelihood = -1450.5416 Iteration 2: log likelihood = -1449.9255 Iteration 3: log likelihood = -1449.9255 Probit regression Number of obs = 2293 LR chi2(6) = 250.95 Prob > chi2 = 0.0000 Log likelihood = -1449.9255 Pseudo R2 = 0.0796 ------------------------------------------------------------------------------ y_lt3 | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- yr89 | -.3431124 .0566074 -6.06 0.000 -.4540608 -.2321639 male | .4245153 .0548217 7.74 0.000 .3170668 .5319638 white | .190951 .0850404 2.25 0.025 .0242748 .3576272 age | .015555 .0017433 8.92 0.000 .0121383 .0189718 ed | -.0319397 .0112858 -2.83 0.005 -.0540595 -.0098199 prst | -.0059387 .0023393 -2.54 0.011 -.0105238 -.0013537 _cons | -.4517883 .1644594 -2.75 0.006 -.7741228 -.1294538 ------------------------------------------------------------------------------ y is less | Artifical 6-category variable than 4? | 1_SD 2_D 3_D 4_SA 5 | Total -----------+-------------------------------------------------------+---------- Not<4 | 0 0 0 411 219 | 828 Is<4 | 297 723 445 0 0 | 1,465 -----------+-------------------------------------------------------+---------- Total | 297 723 445 411 219 | 2,293 | Artifical | 6-category y is less | variable than 4? 
| 6 | Total -----------+-----------+---------- Not<4 | 198 | 828 Is<4 | 0 | 1,465 -----------+-----------+---------- Total | 198 | 2,293 Iteration 0: log likelihood = -1499.7318 Iteration 1: log likelihood = -1429.2806 Iteration 2: log likelihood = -1428.6455 Iteration 3: log likelihood = -1428.6452 Logistic regression Number of obs = 2293 LR chi2(6) = 142.17 Prob > chi2 = 0.0000 Log likelihood = -1428.6452 Pseudo R2 = 0.0474 ------------------------------------------------------------------------------ y_lt4 | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- yr89 | -.3440164 .0920766 -3.74 0.000 -.5244832 -.1635495 male | .6559833 .0914586 7.17 0.000 .4767276 .8352389 white | .2260828 .1352834 1.67 0.095 -.0390678 .4912334 age | .0181817 .0029363 6.19 0.000 .0124266 .0239367 ed | -.0415125 .0192516 -2.16 0.031 -.0792449 -.00378 prst | -.006295 .0038444 -1.64 0.102 -.0138299 .0012399 _cons | .1849854 .2735326 0.68 0.499 -.3511286 .7210993 ------------------------------------------------------------------------------ Iteration 0: log likelihood = -1499.7318 Iteration 1: log likelihood = -1429.0649 Iteration 2: log likelihood = -1428.8369 Iteration 3: log likelihood = -1428.8369 Probit regression Number of obs = 2293 LR chi2(6) = 141.79 Prob > chi2 = 0.0000 Log likelihood = -1428.8369 Pseudo R2 = 0.0473 ------------------------------------------------------------------------------ y_lt4 | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- yr89 | -.2120439 .0563051 -3.77 0.000 -.3223999 -.101688 male | .3980483 .0553706 7.19 0.000 .2895238 .5065728 white | .1389495 .0830906 1.67 0.094 -.023905 .301804 age | .0110942 .0017761 6.25 0.000 .0076131 .0145753 ed | -.0235668 .0115593 -2.04 0.041 -.0462227 -.000911 prst | -.0040603 .002349 -1.73 0.084 -.0086643 .0005438 _cons | .1028786 .1660516 0.62 0.536 -.2225765 .4283337 ------------------------------------------------------------------------------ . . // #4 - Loop Example 1 . // listing variables and value labels . . use wf-loops, clear (Workflow data for illustrating loops \ 2008-04-02) . . * illustrate an extended macro function . local varlabel : variable label warm . display "Variable label for warm: `varlabel'" Variable label for warm: Mom can have warm relations with child . . * loop through variables and print results . foreach varname of varlist warm yr89 male white age ed prst { 2. local varlabel : variable label `varname' 3. display "`varname'" _col(12) "`varlabel'" 4. } warm Mom can have warm relations with child yr89 Survey year: 1=1989 0=1977 male Gender: 1=male 0=female white Race: 1=white 0=not white age Age in years ed Years of education prst Occupational prestige . . // #5 - Loop Example 2 . // creating interactions . . * interactions with simple variable labels . use wf-loops, clear (Workflow data for illustrating loops \ 2008-04-02) . foreach varname of varlist yr89 white age ed prst { 2. generate maleX`varname' = male*`varname' 3. label var maleX`varname' "male*`varname'" 4. } . 
codebook maleX*, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- maleXyr89 2293 2 .1766245 0 1 male*yr89 maleXwhite 2293 2 .4147405 0 1 male*white maleXage 2293 71 20.50807 0 89 male*age maleXed 2293 21 5.735717 0 20 male*ed maleXprst 2293 59 18.76625 0 82 male*prst -------------------------------------------------------------------------------- . . * interactions with detailed variable labels . use wf-loops, clear (Workflow data for illustrating loops \ 2008-04-02) . foreach varname of varlist yr89 white age ed prst { 2. local varlabel : variable label `varname' 3. generate maleX`varname' = male*`varname' 4. label var maleX`varname' "male*`varlabel'" 5. } . codebook maleX*, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- maleXyr89 2293 2 .1766245 0 1 male*Survey year: 1=1989 0=1977 maleXwhite 2293 2 .4147405 0 1 male*Race: 1=white 0=not white maleXage 2293 71 20.50807 0 89 male*Age in years maleXed 2293 21 5.735717 0 20 male*Years of education maleXprst 2293 59 18.76625 0 82 male*Occupational prestige -------------------------------------------------------------------------------- . . * interactions with detailed variable labels and names . use wf-loops, clear (Workflow data for illustrating loops \ 2008-04-02) . foreach varname of varlist yr89 white age ed prst { 2. local varlabel : variable label `varname' 3. generate maleX`varname' = male*`varname' 4. label var maleX`varname' "male*`varname' (`varlabel')" 5. } . codebook maleX*, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- maleXyr89 2293 2 .1766245 0 1 male*yr89 (Survey year: 1=1989 0... maleXwhite 2293 2 .4147405 0 1 male*white (Race: 1=white 0=not ... 
maleXage 2293 71 20.50807 0 89 male*age (Age in years) maleXed 2293 21 5.735717 0 20 male*ed (Years of education) maleXprst 2293 59 18.76625 0 82 male*prst (Occupational prestige) -------------------------------------------------------------------------------- . . // #6 - Loop Example 3 . // models with alternative measures of education . . use wf-loops, clear (Workflow data for illustrating loops \ 2008-04-02) . local edvars "edyrs edgths edgtcol edsqrtyrs edlths" . local rhs "male white age prst yr89" . . foreach edvarname of varlist `edvars' { 2. display _newline "==> education variable: `edvarname'" 3. ologit warm `edvarname' `rhs' 4. } ==> education variable: edyrs Iteration 0: log likelihood = -2995.7704 Iteration 1: log likelihood = -2846.4532 Iteration 2: log likelihood = -2844.9142 Iteration 3: log likelihood = -2844.9123 Ordered logistic regression Number of obs = 2293 LR chi2(6) = 301.72 Prob > chi2 = 0.0000 Log likelihood = -2844.9123 Pseudo R2 = 0.0504 ------------------------------------------------------------------------------ warm | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- edyrs | .0671728 .015975 4.20 0.000 .0358624 .0984831 male | -.7332997 .0784827 -9.34 0.000 -.8871229 -.5794766 white | -.3911595 .1183808 -3.30 0.001 -.6231815 -.1591374 age | -.0216655 .0024683 -8.78 0.000 -.0265032 -.0168278 prst | .0060727 .0032929 1.84 0.065 -.0003813 .0125267 yr89 | .5239025 .0798988 6.56 0.000 .3673037 .6805013 -------------+---------------------------------------------------------------- /cut1 | -2.465362 .2389126 -2.933622 -1.997102 /cut2 | -.630904 .2333155 -1.088194 -.173614 /cut3 | 1.261854 .2340179 .8031873 1.720521 ------------------------------------------------------------------------------ ==> education variable: edgths Iteration 0: log likelihood = -2995.7704 Iteration 1: log likelihood = -2844.4882 Iteration 2: log likelihood = -2842.92 Iteration 3: log likelihood = -2842.9181 Ordered logistic regression Number of obs = 2293 LR chi2(6) = 305.70 Prob > chi2 = 0.0000 Log likelihood = -2842.9181 Pseudo R2 = 0.0510 ------------------------------------------------------------------------------ warm | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- edgths | .4336817 .0932048 4.65 0.000 .2510036 .6163598 male | -.7501362 .0786274 -9.54 0.000 -.904243 -.5960294 white | -.338431 .1178826 -2.87 0.004 -.5694766 -.1073854 age | -.0230957 .0023678 -9.75 0.000 -.0277365 -.018455 prst | .0072846 .0030621 2.38 0.017 .001283 .0132862 yr89 | .5284706 .0797644 6.63 0.000 .3721354 .6848059 -------------+---------------------------------------------------------------- /cut1 | -3.104654 .1924467 -3.481842 -2.727465 /cut2 | -1.27217 .1821118 -1.629102 -.9152373 /cut3 | .6261563 .1810119 .2713796 .9809331 ------------------------------------------------------------------------------ ==> education variable: edgtcol Iteration 0: log likelihood = -2995.7704 Iteration 1: log likelihood = -2851.7251 Iteration 2: log likelihood = -2850.3162 Iteration 3: log likelihood = -2850.3146 Ordered logistic regression Number of obs = 2293 LR chi2(6) = 290.91 Prob > chi2 = 0.0000 Log likelihood = -2850.3146 Pseudo R2 = 0.0486 ------------------------------------------------------------------------------ warm | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- edgtcol | .4147723 .1575721 2.63 0.008 .1059367 .7236078 male | -.7467216 .0786991 -9.49 0.000 -.9009689 -.5924742 white | -.351685 .1178551 -2.98 0.003 -.5826767 -.1206933 age | -.0249174 .0023295 -10.70 0.000 -.0294832 -.0203517 prst | .010758 .0029663 3.63 0.000 .0049441 .0165719 yr89 | .5548804 .0794114 6.99 0.000 .399237 .7105238 -------------+---------------------------------------------------------------- /cut1 | -3.166425 .1959627 -3.550505 -2.782346 /cut2 | -1.342651 .1856695 -1.706556 -.9787453 /cut3 | .5476301 .1842879 .1864324 .9088278 ------------------------------------------------------------------------------ ==> education variable: edsqrtyrs Iteration 0: log likelihood = -2995.7704 Iteration 1: log likelihood = -2847.2968 Iteration 2: log likelihood = -2845.774 Iteration 3: log likelihood = -2845.7721 Ordered logistic regression Number of obs = 2293 LR chi2(6) = 300.00 Prob > chi2 = 0.0000 Log likelihood = -2845.7721 Pseudo R2 = 0.0501 ------------------------------------------------------------------------------ warm | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- edsqrtyrs | .4017779 .1007724 3.99 0.000 .2042677 .5992882 male | -.7253242 .0784722 -9.24 0.000 -.879127 -.5715214 white | -.4017311 .1186862 -3.38 0.001 -.6343518 -.1691105 age | -.0217627 .0024747 -8.79 0.000 -.026613 -.0169124 prst | .0072201 .0031917 2.26 0.024 .0009646 .0134757 yr89 | .5261914 .0798683 6.59 0.000 .3696524 .6827304 -------------+---------------------------------------------------------------- /cut1 | -1.858452 .3578616 -2.559848 -1.157056 /cut2 | -.0242802 .3561258 -.7222739 .6737135 /cut3 | 1.867063 .3572798 1.166807 2.567319 ------------------------------------------------------------------------------ ==> education variable: edlths Iteration 0: log likelihood = -2995.7704 Iteration 1: log likelihood = -2852.708 Iteration 2: log likelihood = -2851.3043 Iteration 3: log likelihood = -2851.3027 Ordered logistic regression Number of obs = 2293 LR chi2(6) = 288.94 Prob > chi2 = 0.0000 Log likelihood = -2851.3027 Pseudo R2 = 0.0482 ------------------------------------------------------------------------------ warm | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- edlths | -.217821 .0977358 -2.23 0.026 -.4093798 -.0262623 male | -.7212474 .0785234 -9.19 0.000 -.8751505 -.5673443 white | -.3705813 .1180965 -3.14 0.002 -.6020462 -.1391165 age | -.0234153 .0024519 -9.55 0.000 -.0282209 -.0186097 prst | .0114971 .0029235 3.93 0.000 .0057671 .0172271 yr89 | .5421916 .0798128 6.79 0.000 .3857614 .6986219 -------------+---------------------------------------------------------------- /cut1 | -3.176745 .1984827 -3.565764 -2.787726 /cut2 | -1.350006 .1879129 -1.718308 -.9817031 /cut3 | .5377913 .1866501 .1719639 .9036187 ------------------------------------------------------------------------------ . . // #7 - Loop Example 4 . // recoding variables . . * recode social distance measures . 
use wf-loops, clear (Workflow data for illustrating loops \ 2008-04-02) . local sdvars "sdneighb sdsocial sdchild sdfriend sdwork sdmarry" . . foreach varname of varlist `sdvars' { 2. generate B`varname' = `varname' 3. label var B`varname' "`varname': (1,2)=0 (3,4)=1" 4. replace B`varname' = 0 if `varname'==1 | `varname'==2 5. replace B`varname' = 1 if `varname'==3 | `varname'==4 6. } (1803 missing values generated) (395 real changes made) (95 real changes made) (1805 missing values generated) (354 real changes made) (134 real changes made) (1812 missing values generated) (136 real changes made) (345 real changes made) (1806 missing values generated) (347 real changes made) (140 real changes made) (1808 missing values generated) (334 real changes made) (151 real changes made) (1838 missing values generated) (215 real changes made) (240 real changes made) . codebook B*, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- Bsdneighb 490 2 .1938776 0 1 sdneighb: (1,2)=0 (3,4)=1 Bsdsocial 488 2 .2745902 0 1 sdsocial: (1,2)=0 (3,4)=1 Bsdchild 481 2 .7172557 0 1 sdchild: (1,2)=0 (3,4)=1 Bsdfriend 487 2 .2874743 0 1 sdfriend: (1,2)=0 (3,4)=1 Bsdwork 485 2 .3113402 0 1 sdwork: (1,2)=0 (3,4)=1 Bsdmarry 455 2 .5274725 0 1 sdmarry: (1,2)=0 (3,4)=1 -------------------------------------------------------------------------------- . . * transform income from five panels using loops over names . use wf-loops, clear (Workflow data for illustrating loops \ 2008-04-02) . foreach varname of varlist incp1 incp2 incp3 incp4 incp5 { 2. generate ln`varname' = ln(`varname'+.5) 3. label var ln`varname' "Log(`varname'+.5)" 4. } (1540 missing values generated) (1540 missing values generated) (1540 missing values generated) (1540 missing values generated) (1540 missing values generated) . . * transform income from five panels using loops over panel . 
use wf-loops, clear
(Workflow data for illustrating loops \ 2008-04-02)
. foreach panelnum in 1 2 3 4 5 {
  2.     generate lnincp`panelnum' = ln(incp`panelnum'+.5)
  3.     label var lnincp`panelnum' "Log(incp`panelnum'+.5)"
  4. }
(1540 missing values generated)
(1540 missing values generated)
(1540 missing values generated)
(1540 missing values generated)
(1540 missing values generated)
.
. // #8 - Loop Example 5
. // creating a macro that holds accumulated information
.
. local varlist ""
. forvalues panelnum = 1/20 {
  2.     local varlist "`varlist' incp`panelnum'"
  3. }
. display "varlist is: `varlist'"
varlist is: incp1 incp2 incp3 incp4 incp5 incp6 incp7 incp8 incp9 incp10 incp11
> incp12 incp13 incp14 incp15 incp16 incp17 incp18 incp19 incp20
.
. * expanded version
. local varlist ""
. forvalues panelnum = 1/20 {
  2.     local varlist "`varlist' incp`panelnum'"
  3.     display _newline "panelnum is: `panelnum'"
  4.     display "varlist is: `varlist'"
  5. }

panelnum is: 1
varlist is: incp1

panelnum is: 2
varlist is: incp1 incp2

panelnum is: 3
varlist is: incp1 incp2 incp3

panelnum is: 4
varlist is: incp1 incp2 incp3 incp4

panelnum is: 5
varlist is: incp1 incp2 incp3 incp4 incp5

panelnum is: 6
varlist is: incp1 incp2 incp3 incp4 incp5 incp6

panelnum is: 7
varlist is: incp1 incp2 incp3 incp4 incp5 incp6 incp7

panelnum is: 8
varlist is: incp1 incp2 incp3 incp4 incp5 incp6 incp7 incp8

panelnum is: 9
varlist is: incp1 incp2 incp3 incp4 incp5 incp6 incp7 incp8 incp9

panelnum is: 10
varlist is: incp1 incp2 incp3 incp4 incp5 incp6 incp7 incp8 incp9 incp10

panelnum is: 11
varlist is: incp1 incp2 incp3 incp4 incp5 incp6 incp7 incp8 incp9 incp10 incp1
> 1

panelnum is: 12
varlist is: incp1 incp2 incp3 incp4 incp5 incp6 incp7 incp8 incp9 incp10 incp1
> 1 incp12

panelnum is: 13
varlist is: incp1 incp2 incp3 incp4 incp5 incp6 incp7 incp8 incp9 incp10 incp1
> 1 incp12 incp13

panelnum is: 14
varlist is: incp1 incp2 incp3 incp4 incp5 incp6 incp7 incp8 incp9 incp10 incp1
> 1 incp12 incp13 incp14

panelnum is: 15
varlist is: incp1 incp2 incp3 incp4 incp5 incp6 incp7 incp8 incp9 incp10 incp1
> 1 incp12 incp13 incp14 incp15

panelnum is: 16
varlist is: incp1 incp2 incp3 incp4 incp5 incp6 incp7 incp8 incp9 incp10 incp1
> 1 incp12 incp13 incp14 incp15 incp16

panelnum is: 17
varlist is: incp1 incp2 incp3 incp4 incp5 incp6 incp7 incp8 incp9 incp10 incp1
> 1 incp12 incp13 incp14 incp15 incp16 incp17

panelnum is: 18
varlist is: incp1 incp2 incp3 incp4 incp5 incp6 incp7 incp8 incp9 incp10 incp1
> 1 incp12 incp13 incp14 incp15 incp16 incp17 incp18

panelnum is: 19
varlist is: incp1 incp2 incp3 incp4 incp5 incp6 incp7 incp8 incp9 incp10 incp1
> 1 incp12 incp13 incp14 incp15 incp16 incp17 incp18 incp19

panelnum is: 20
varlist is: incp1 incp2 incp3 incp4 incp5 incp6 incp7 incp8 incp9 incp10 incp1
> 1 incp12 incp13 incp14 incp15 incp16 incp17 incp18 incp19 incp20
.
. // #9 - Loop Example #6
. // retrieving information returned by Stata
.
. // computing the percent of ones
.
. * first, recreate the binary variables
. use wf-loops, clear
(Workflow data for illustrating loops \ 2008-04-02)
. local sdvars "sdneighb sdsocial sdchild sdfriend sdwork sdmarry"
.
. foreach varname of varlist `sdvars' {
  2.     generate B`varname' = `varname'
  3.     replace B`varname' = 0 if `varname'==1 | `varname'==2
  4.     replace B`varname' = 1 if `varname'==3 | `varname'==4
  5. }
(1803 missing values generated)
(395 real changes made)
(95 real changes made)
(1805 missing values generated)
(354 real changes made)
(134 real changes made)
(1812 missing values generated)
(136 real changes made)
(345 real changes made)
(1806 missing values generated)
(347 real changes made)
(140 real changes made)
(1808 missing values generated)
(334 real changes made)
(151 real changes made)
(1838 missing values generated)
(215 real changes made)
(240 real changes made)
.
. * compute the mean and see what is returned
. summarize Bsdneighb
    Variable |       Obs        Mean    Std. Dev.
Min Max -------------+-------------------------------------------------------- Bsdneighb | 490 .1938776 .3957381 0 1 . return list scalars: r(N) = 490 r(sum_w) = 490 r(mean) = .1938775510204082 r(Var) = .1566086557322316 r(sd) = .3957381150865198 r(min) = 0 r(max) = 1 r(sum) = 95 . . * compute the percent of one's . local pct1 = r(mean)*100 . display "The percent of ones is: `pct1'" The percent of ones is: 19.38775510204082 . . * using a loop to examine multiple variables . . foreach varname of varlist `sdvars' { 2. quietly summarize B`varname' 3. local samplesize = r(N) 4. local pct1 = r(mean)*100 5. display "B`varname':" _col(14) "Pct1 = " %5.2f `pct1' /// > _col(30) "N = `samplesize'" 6. } Bsdneighb: Pct1 = 19.39 N = 490 Bsdsocial: Pct1 = 27.46 N = 488 Bsdchild: Pct1 = 71.73 N = 481 Bsdfriend: Pct1 = 28.75 N = 487 Bsdwork: Pct1 = 31.13 N = 485 Bsdmarry: Pct1 = 52.75 N = 455 . . // computing the coefficient of variation . . use wf-loops, clear (Workflow data for illustrating loops \ 2008-04-02) . * note that the variables differ but the CV are the same . foreach varname of varlist incp1 incp2 incp3 incp4 { 2. quietly summarize `varname' 3. local cv = r(sd)/r(mean) 4. display "CV for `varname': " %8.3f `cv' 5. } CV for incp1: 0.538 CV for incp2: 0.538 CV for incp3: 0.538 CV for incp4: 0.538 . . // #10 - extending Loop Example 1 . // counters . . * version 1 . use wf-loops, clear (Workflow data for illustrating loops \ 2008-04-02) . local counter = 0 . foreach varname of varlist warm yr89 male white age ed prst { 2. local counter = `counter' + 1 3. local varlabel : variable label `varname' 4. display "`counter'. `varname'" _col(12) "`varlabel'" 5. } 1. warm Mom can have warm relations with child 2. yr89 Survey year: 1=1989 0=1977 3. male Gender: 1=male 0=female 4. white Race: 1=white 0=not white 5. age Age in years 6. ed Years of education 7. prst Occupational prestige . . * version 2 . use wf-loops, clear (Workflow data for illustrating loops \ 2008-04-02) . 
local counter = 0 . foreach varname of varlist warm yr89 male white age ed prst { 2. local ++counter 3. local varlabel : variable label `varname' 4. display "`counter'. `varname'" _col(12) "`varlabel'" 5. } 1. warm Mom can have warm relations with child 2. yr89 Survey year: 1=1989 0=1977 3. male Gender: 1=male 0=female 4. white Race: 1=white 0=not white 5. age Age in years 6. ed Years of education 7. prst Occupational prestige . . // #11 - Loop Example #6 . // loops for saving results to matrices . . * first, recreate the binary variables . use wf-loops, clear (Workflow data for illustrating loops \ 2008-04-02) . local sdvars "sdneighb sdsocial sdchild sdfriend sdwork sdmarry" . . foreach varname of varlist `sdvars' { 2. generate B`varname' = `varname' 3. replace B`varname' = 0 if `varname'==1 | `varname'==2 4. replace B`varname' = 1 if `varname'==3 | `varname'==4 5. } (1803 missing values generated) (395 real changes made) (95 real changes made) (1805 missing values generated) (354 real changes made) (134 real changes made) (1812 missing values generated) (136 real changes made) (345 real changes made) (1806 missing values generated) (347 real changes made) (140 real changes made) (1808 missing values generated) (334 real changes made) (151 real changes made) (1838 missing values generated) (215 real changes made) (240 real changes made) . . * compute percent 1s and N for each variable . local sdvars "Bsdneighb Bsdsocial Bsdchild Bsdfriend Bsdwork Bsdmarry" . local nvars : word count `sdvars' . matrix stats = J(`nvars',2,.) . matrix list stats stats[6,2] c1 c2 r1 . . r2 . . r3 . . r4 . . r5 . . r6 . . . matrix colnames stats = Pct1s N . matrix rownames stats = `sdvars' . matrix list stats stats[6,2] Pct1s N Bsdneighb . . Bsdsocial . . Bsdchild . . Bsdfriend . . Bsdwork . . Bsdmarry . . . . local irow = 0 . foreach varname of varlist `sdvars' { 2. local ++irow 3. quietly summarize `varname' 4. local samplesize = r(N) 5. local pct1 = r(mean)*100 6. 
matrix stats[`irow',1] = `pct1' 7. matrix stats[`irow',2] = `samplesize' 8. } . . matrix list stats, format(%9.3f) stats[6,2] Pct1s N Bsdneighb 19.388 490.000 Bsdsocial 27.459 488.000 Bsdchild 71.726 481.000 Bsdfriend 28.747 487.000 Bsdwork 31.134 485.000 Bsdmarry 52.747 455.000 . . // #12 . // nested loops . . use wf-loops, clear (Workflow data for illustrating loops \ 2008-04-02) . foreach yvar of varlist ya yb yc yd { // loop 1 begins 2. foreach cutpt in 2 3 4 { // loop 2 begins 3. * create binary variable . generate `yvar'_lt`cutpt' = y<`cutpt' if !missing(y) 4. * add labels . label var `yvar'_lt`cutpt' "y is less than `cutpt'?" 5. label define `yvar'_lt`cutpt' 0 "Not<`cutpt'" 1 "Is<`cutpt'" 6. label val `yvar'_lt`cutpt' `yvar'_lt`cutpt' 7. } // loop 2 ends 8. } // loop 1 ends . . tab1 ya ya_lt2 ya_lt3 ya_lt4 -> tabulation of ya Mom can | have warm | relations | with child | Freq. Percent Cum. ------------+----------------------------------- 1_SD | 297 12.95 12.95 2_D | 723 31.53 44.48 3_D | 856 37.33 81.81 4_SA | 417 18.19 100.00 ------------+----------------------------------- Total | 2,293 100.00 -> tabulation of ya_lt2 y is less | than 2? | Freq. Percent Cum. ------------+----------------------------------- Not<2 | 1,996 87.05 87.05 Is<2 | 297 12.95 100.00 ------------+----------------------------------- Total | 2,293 100.00 -> tabulation of ya_lt3 y is less | than 3? | Freq. Percent Cum. ------------+----------------------------------- Not<3 | 1,273 55.52 55.52 Is<3 | 1,020 44.48 100.00 ------------+----------------------------------- Total | 2,293 100.00 -> tabulation of ya_lt4 y is less | than 4? | Freq. Percent Cum. ------------+----------------------------------- Not<4 | 828 36.11 36.11 Is<4 | 1,465 63.89 100.00 ------------+----------------------------------- Total | 2,293 100.00 . tab1 yb yb_lt2 yb_lt3 yb_lt4 -> tabulation of yb Mom can | have warm | relations | with child | Freq. Percent Cum. 
------------+----------------------------------- 1_SD | 297 12.95 12.95 2_D | 723 31.53 44.48 3_D | 856 37.33 81.81 4_SA | 417 18.19 100.00 ------------+----------------------------------- Total | 2,293 100.00 -> tabulation of yb_lt2 y is less | than 2? | Freq. Percent Cum. ------------+----------------------------------- Not<2 | 1,996 87.05 87.05 Is<2 | 297 12.95 100.00 ------------+----------------------------------- Total | 2,293 100.00 -> tabulation of yb_lt3 y is less | than 3? | Freq. Percent Cum. ------------+----------------------------------- Not<3 | 1,273 55.52 55.52 Is<3 | 1,020 44.48 100.00 ------------+----------------------------------- Total | 2,293 100.00 -> tabulation of yb_lt4 y is less | than 4? | Freq. Percent Cum. ------------+----------------------------------- Not<4 | 828 36.11 36.11 Is<4 | 1,465 63.89 100.00 ------------+----------------------------------- Total | 2,293 100.00 . . log close log: D:\wf\work\wf4-loops.log log type: text closed on: 24 Oct 2008, 09:41:05 -------------------------------------------------------------------------------- . exit end of do-file . do wf4-loops-error1.do, nostop . capture log close . log using wf4-loops-error1, replace text (note: file D:\wf\work\wf4-loops-error1.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf4-loops-error1.log log type: text opened on: 24 Oct 2008, 09:41:05 . . // program: wf4-loops-error1.do \ for stata 9 . // task: error in a loop command . // project: workflow chapter 4 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data . . use wf-loops, clear (Workflow data for illustrating loops \ 2008-04-02) . . // #2 . // loop command . . foreach varname in "sdneighb sdsocial sdchild sdfriend sdwork sdmarry" { 2. gen B`varname' = `varname' 3. replace B`varname' = 0 if `varname'==1 | `varname'==2 4. 
replace B`varname' = 1 if `varname'==3 | `varname'==4
  5. }
sdsocial already defined
r(110);
.
. log close
      log:  D:\wf\work\wf4-loops-error1.log
 log type:  text
closed on:  24 Oct 2008, 09:41:05
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf4-loops-error1a.do, nostop
. capture log close
. log using wf4-loops-error1a, replace text
(note: file D:\wf\work\wf4-loops-error1a.log not found)
--------------------------------------------------------------------------------
      log:  D:\wf\work\wf4-loops-error1a.log
 log type:  text
opened on:  24 Oct 2008, 09:41:05
.
. // program: wf4-loops-error1a.do \ for stata 9
. // task: error in a loop command
. // - removing sdsocial from the list
. // project: workflow chapter 4
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data
.
. use wf-loops, clear
(Workflow data for illustrating loops \ 2008-04-02)
.
. // #2
. // loop command
.
. foreach varname in "sdneighb sdchild sdfriend sdwork sdmarry" {
  2.     gen B`varname' = `varname'
  3.     replace B`varname' = 0 if `varname'==1 | `varname'==2
  4.     replace B`varname' = 1 if `varname'==3 | `varname'==4
  5. }
sdchild already defined
r(110);
.
. log close
      log:  D:\wf\work\wf4-loops-error1a.log
 log type:  text
closed on:  24 Oct 2008, 09:41:05
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf4-loops-error1b.do, nostop
. capture log close
. log using wf4-loops-error1b, replace text
(note: file D:\wf\work\wf4-loops-error1b.log not found)
--------------------------------------------------------------------------------
      log:  D:\wf\work\wf4-loops-error1b.log
 log type:  text
opened on:  24 Oct 2008, 09:41:05
.
. // program: wf4-loops-error1b.do \ for stata 9
. // task: error in a loop command
. // - using display to debug
. // project: workflow chapter 4
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data
.
. use wf-loops, clear
(Workflow data for illustrating loops \ 2008-04-02)
.
. // #2
. // loop command
.
. foreach varname in "sdneighb sdsocial sdchild sdfriend sdwork sdmarry" {
  2.     display "==> varname is: >`varname'<"
  3.     gen B`varname' = `varname'
  4.     replace B`varname' = 0 if `varname'==1 | `varname'==2
  5.     replace B`varname' = 1 if `varname'==3 | `varname'==4
  6. }
==> varname is: >sdneighb sdsocial sdchild sdfriend sdwork sdmarry<
sdsocial already defined
r(110);
.
. log close
      log:  D:\wf\work\wf4-loops-error1b.log
 log type:  text
closed on:  24 Oct 2008, 09:41:05
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf4-loops-error1c.do, nostop
. capture log close
. log using wf4-loops-error1c, replace text
(note: file D:\wf\work\wf4-loops-error1c.log not found)
--------------------------------------------------------------------------------
      log:  D:\wf\work\wf4-loops-error1c.log
 log type:  text
opened on:  24 Oct 2008, 09:41:05
.
. // program: wf4-loops-error1c.do \ for stata 9
. // task: error in a loop command
. // -
. // project: workflow chapter 4
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data
.
. use wf-loops, clear
(Workflow data for illustrating loops \ 2008-04-02)
.
. // #2
. // loop command
.
. set trace on
. foreach varname in "sdneighb sdsocial sdchild sdfriend sdwork sdmarry" {
  2.     gen B`varname' = `varname'
  3.     replace B`varname' = 0 if `varname'==1 | `varname'==2
  4.     replace B`varname' = 1 if `varname'==3 | `varname'==4
  5. }
- foreach varname in "sdneighb sdsocial sdchild sdfriend sdwork sdmarry" {
- gen B`varname' = `varname'
= gen Bsdneighb sdsocial sdchild sdfriend sdwork sdmarry = sdneighb sdsocial
> sdchild sdfriend sdwork sdmarry
sdsocial already defined
  replace B`varname' = 0 if `varname'==1 | `varname'==2
  replace B`varname' = 1 if `varname'==3 | `varname'==4
  }
r(110);
.
. log close
      log:  D:\wf\work\wf4-loops-error1c.log
 log type:  text
closed on:  24 Oct 2008, 09:41:05
--------------------------------------------------------------------------------
. exit
end of do-file
.
. * including files
. do wf4-include.do, nostop
. capture log close
. log using wf4-include, replace text
(note: file D:\wf\work\wf4-include.log not found)
--------------------------------------------------------------------------------
      log:  D:\wf\work\wf4-include.log
 log type:  text
opened on:  24 Oct 2008, 09:41:05
.
. // program: wf4-include.do \ for stata 9
. // include: requires wf4-include-2digit-recode.doi
. // & wf4-include-3digit-recode.doi
. // task: examples of include command
. // project: workflow chapter 4
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // recode variables without include and without loops
.
. use wf-include, clear
(Workflow data to illustrate include command \ 2008-04-02)
.
. // recode two digit missing values
.
. * inneighb: recode 97, 98 & 99
. clonevar inneighbR = inneighb
. replace inneighbR = .a if inneighbR==97
(0 real changes made)
. replace inneighbR = .b if inneighbR==98
(10 real changes made, 10 to missing)
. replace inneighbR = .c if inneighbR==99
(1793 real changes made, 1793 to missing)
.
tabulate inneighb inneighbR, miss nolabel Q13 Would | have X as | Q13 Would have X as neighbor neighbor | 1 2 3 4 .b | Total -----------+-------------------------------------------------------+---------- 1 | 183 0 0 0 0 | 183 2 | 0 212 0 0 0 | 212 3 | 0 0 68 0 0 | 68 4 | 0 0 0 27 0 | 27 98 | 0 0 0 0 10 | 10 99 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 183 212 68 27 10 | 2,293 | Q13 Would Q13 Would | have X as have X as | neighbor neighbor | .c | Total -----------+-----------+---------- 1 | 0 | 183 2 | 0 | 212 3 | 0 | 68 4 | 0 | 27 98 | 0 | 10 99 | 1,793 | 1,793 -----------+-----------+---------- Total | 1,793 | 2,293 . * insocial: recode 97, 98 & 99 . clonevar insocialR = insocial . replace insocialR = .a if insocialR==97 (0 real changes made) . replace insocialR = .b if insocialR==98 (12 real changes made, 12 to missing) . replace insocialR = .c if insocialR==99 (1793 real changes made, 1793 to missing) . tabulate insocial insocialR, miss nolabel Q14 Would | socialize | Q14 Would socialize w X w X | 1 2 3 4 .b | Total -----------+-------------------------------------------------------+---------- 1 | 147 0 0 0 0 | 147 2 | 0 207 0 0 0 | 207 3 | 0 0 92 0 0 | 92 4 | 0 0 0 42 0 | 42 98 | 0 0 0 0 12 | 12 99 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 147 207 92 42 12 | 2,293 | Q14 Would Q14 Would | socialize socialize | w X w X | .c | Total -----------+-----------+---------- 1 | 0 | 147 2 | 0 | 207 3 | 0 | 92 4 | 0 | 42 98 | 0 | 12 99 | 1,793 | 1,793 -----------+-----------+---------- Total | 1,793 | 2,293 . * inchild: recode 97, 98 & 99 . clonevar inchildR = inchild . replace inchildR = .a if inchildR==97 (1 real change made, 1 to missing) . replace inchildR = .b if inchildR==98 (18 real changes made, 18 to missing) . replace inchildR = .c if inchildR==99 (1793 real changes made, 1793 to missing) . 
tabulate inchild inchildR, miss nolabel Q15 Would | let X care | for | Q15 Would let X care for children children | 1 2 3 4 .a | Total -----------+-------------------------------------------------------+---------- 1 | 54 0 0 0 0 | 54 2 | 0 82 0 0 0 | 82 3 | 0 0 146 0 0 | 146 4 | 0 0 0 199 0 | 199 97 | 0 0 0 0 1 | 1 98 | 0 0 0 0 0 | 18 99 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 54 82 146 199 1 | 2,293 Q15 Would | let X care | Q15 Would let X care for | for children children | .b .c | Total -----------+----------------------+---------- 1 | 0 0 | 54 2 | 0 0 | 82 3 | 0 0 | 146 4 | 0 0 | 199 97 | 0 0 | 1 98 | 18 0 | 18 99 | 0 1,793 | 1,793 -----------+----------------------+---------- Total | 18 1,793 | 2,293 . * infriend: recode 997 998 999 . clonevar infriendR = infriend . replace infriendR = .a if infriendR==97 (0 real changes made) . replace infriendR = .b if infriendR==98 (0 real changes made) . replace infriendR = .c if infriendR==99 (0 real changes made) . tabulate infriend infriendR, miss nolabel Q16 Would | be friends | Q16 Would be friends w X w X | 1 2 3 4 998 | Total -----------+-------------------------------------------------------+---------- 1 | 161 0 0 0 0 | 161 2 | 0 186 0 0 0 | 186 3 | 0 0 99 0 0 | 99 4 | 0 0 0 41 0 | 41 998 | 0 0 0 0 13 | 13 999 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 161 186 99 41 13 | 2,293 | Q16 Would Q16 Would | be friends be friends | w X w X | 999 | Total -----------+-----------+---------- 1 | 0 | 161 2 | 0 | 186 3 | 0 | 99 4 | 0 | 41 998 | 0 | 13 999 | 1,793 | 1,793 -----------+-----------+---------- Total | 1,793 | 2,293 . . // recode three digit missing values . . * inmarry: recode 997 998 999 . clonevar inmarryR = inmarry . replace inmarryR = .a if inmarryR==997 (2 real changes made, 2 to missing) . replace inmarryR = .b if inmarryR==998 (43 real changes made, 43 to missing) . 
replace inmarryR = .c if inmarryR==999 (1793 real changes made, 1793 to missing) . tabulate inmarry inmarryR, miss nolabel Q18 Would | let X | marry | Q18 Would let X marry relative relative | 1 2 3 4 .a | Total -----------+-------------------------------------------------------+---------- 1 | 81 0 0 0 0 | 81 2 | 0 134 0 0 0 | 134 3 | 0 0 117 0 0 | 117 4 | 0 0 0 123 0 | 123 997 | 0 0 0 0 2 | 2 998 | 0 0 0 0 0 | 43 999 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 81 134 117 123 2 | 2,293 Q18 Would | let X | Q18 Would let X marry marry | relative relative | .b .c | Total -----------+----------------------+---------- 1 | 0 0 | 81 2 | 0 0 | 134 3 | 0 0 | 117 4 | 0 0 | 123 997 | 0 0 | 2 998 | 43 0 | 43 999 | 0 1,793 | 1,793 -----------+----------------------+---------- Total | 43 1,793 | 2,293 . * inwork: recode 997 998 999 . clonevar inworkR = inwork . replace inworkR = .a if inworkR==997 (1 real change made, 1 to missing) . replace inworkR = .b if inworkR==998 (14 real changes made, 14 to missing) . replace inworkR = .c if inworkR==999 (1793 real changes made, 1793 to missing) . tabulate inwork inworkR, miss nolabel Q17 Would | work w X | Q17 Would work w X on job on job | 1 2 3 4 .a | Total -----------+-------------------------------------------------------+---------- 1 | 133 0 0 0 0 | 133 2 | 0 201 0 0 0 | 201 3 | 0 0 92 0 0 | 92 4 | 0 0 0 59 0 | 59 997 | 0 0 0 0 1 | 1 998 | 0 0 0 0 0 | 14 999 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 133 201 92 59 1 | 2,293 Q17 Would | Q17 Would work w X on work w X | job on job | .b .c | Total -----------+----------------------+---------- 1 | 0 0 | 133 2 | 0 0 | 201 3 | 0 0 | 92 4 | 0 0 | 59 997 | 0 0 | 1 998 | 14 0 | 14 999 | 0 1,793 | 1,793 -----------+----------------------+---------- Total | 14 1,793 | 2,293 . . // #2 . // recode variables without include using a loop . . 
use wf-include, clear (Workflow data to illustrate include command \ 2008-04-02) . . // recode two digit missing values . . foreach varname in inneighb insocial inchild infriend { 2. clonevar `varname'R = `varname' 3. replace `varname'R = .a if `varname'R==97 4. replace `varname'R = .b if `varname'R==98 5. replace `varname'R = .c if `varname'R==99 6. tabulate `varname' `varname'R, miss nolabel 7. } (0 real changes made) (10 real changes made, 10 to missing) (1793 real changes made, 1793 to missing) Q13 Would | have X as | Q13 Would have X as neighbor neighbor | 1 2 3 4 .b | Total -----------+-------------------------------------------------------+---------- 1 | 183 0 0 0 0 | 183 2 | 0 212 0 0 0 | 212 3 | 0 0 68 0 0 | 68 4 | 0 0 0 27 0 | 27 98 | 0 0 0 0 10 | 10 99 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 183 212 68 27 10 | 2,293 | Q13 Would Q13 Would | have X as have X as | neighbor neighbor | .c | Total -----------+-----------+---------- 1 | 0 | 183 2 | 0 | 212 3 | 0 | 68 4 | 0 | 27 98 | 0 | 10 99 | 1,793 | 1,793 -----------+-----------+---------- Total | 1,793 | 2,293 (0 real changes made) (12 real changes made, 12 to missing) (1793 real changes made, 1793 to missing) Q14 Would | socialize | Q14 Would socialize w X w X | 1 2 3 4 .b | Total -----------+-------------------------------------------------------+---------- 1 | 147 0 0 0 0 | 147 2 | 0 207 0 0 0 | 207 3 | 0 0 92 0 0 | 92 4 | 0 0 0 42 0 | 42 98 | 0 0 0 0 12 | 12 99 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 147 207 92 42 12 | 2,293 | Q14 Would Q14 Would | socialize socialize | w X w X | .c | Total -----------+-----------+---------- 1 | 0 | 147 2 | 0 | 207 3 | 0 | 92 4 | 0 | 42 98 | 0 | 12 99 | 1,793 | 1,793 -----------+-----------+---------- Total | 1,793 | 2,293 (1 real change made, 1 to missing) (18 real changes made, 18 to missing) (1793 real changes made, 1793 to missing) 
Q15 Would | let X care | for | Q15 Would let X care for children children | 1 2 3 4 .a | Total -----------+-------------------------------------------------------+---------- 1 | 54 0 0 0 0 | 54 2 | 0 82 0 0 0 | 82 3 | 0 0 146 0 0 | 146 4 | 0 0 0 199 0 | 199 97 | 0 0 0 0 1 | 1 98 | 0 0 0 0 0 | 18 99 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 54 82 146 199 1 | 2,293 Q15 Would | let X care | Q15 Would let X care for | for children children | .b .c | Total -----------+----------------------+---------- 1 | 0 0 | 54 2 | 0 0 | 82 3 | 0 0 | 146 4 | 0 0 | 199 97 | 0 0 | 1 98 | 18 0 | 18 99 | 0 1,793 | 1,793 -----------+----------------------+---------- Total | 18 1,793 | 2,293 (0 real changes made) (0 real changes made) (0 real changes made) Q16 Would | be friends | Q16 Would be friends w X w X | 1 2 3 4 998 | Total -----------+-------------------------------------------------------+---------- 1 | 161 0 0 0 0 | 161 2 | 0 186 0 0 0 | 186 3 | 0 0 99 0 0 | 99 4 | 0 0 0 41 0 | 41 998 | 0 0 0 0 13 | 13 999 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 161 186 99 41 13 | 2,293 | Q16 Would Q16 Would | be friends be friends | w X w X | 999 | Total -----------+-----------+---------- 1 | 0 | 161 2 | 0 | 186 3 | 0 | 99 4 | 0 | 41 998 | 0 | 13 999 | 1,793 | 1,793 -----------+-----------+---------- Total | 1,793 | 2,293 . . // recode three digit missing values . . foreach varname in inmarry inwork { 2. clonevar `varname'R = `varname' 3. replace `varname'R = .a if `varname'R==997 4. replace `varname'R = .b if `varname'R==998 5. replace `varname'R = .c if `varname'R==999 6. tabulate `varname' `varname'R, miss nolabel 7. 
} (2 real changes made, 2 to missing) (43 real changes made, 43 to missing) (1793 real changes made, 1793 to missing) Q18 Would | let X | marry | Q18 Would let X marry relative relative | 1 2 3 4 .a | Total -----------+-------------------------------------------------------+---------- 1 | 81 0 0 0 0 | 81 2 | 0 134 0 0 0 | 134 3 | 0 0 117 0 0 | 117 4 | 0 0 0 123 0 | 123 997 | 0 0 0 0 2 | 2 998 | 0 0 0 0 0 | 43 999 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 81 134 117 123 2 | 2,293 Q18 Would | let X | Q18 Would let X marry marry | relative relative | .b .c | Total -----------+----------------------+---------- 1 | 0 0 | 81 2 | 0 0 | 134 3 | 0 0 | 117 4 | 0 0 | 123 997 | 0 0 | 2 998 | 43 0 | 43 999 | 0 1,793 | 1,793 -----------+----------------------+---------- Total | 43 1,793 | 2,293 (1 real change made, 1 to missing) (14 real changes made, 14 to missing) (1793 real changes made, 1793 to missing) Q17 Would | work w X | Q17 Would work w X on job on job | 1 2 3 4 .a | Total -----------+-------------------------------------------------------+---------- 1 | 133 0 0 0 0 | 133 2 | 0 201 0 0 0 | 201 3 | 0 0 92 0 0 | 92 4 | 0 0 0 59 0 | 59 997 | 0 0 0 0 1 | 1 998 | 0 0 0 0 0 | 14 999 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 133 201 92 59 1 | 2,293 Q17 Would | Q17 Would work w X on work w X | job on job | .b .c | Total -----------+----------------------+---------- 1 | 0 0 | 133 2 | 0 0 | 201 3 | 0 0 | 92 4 | 0 0 | 59 997 | 0 0 | 1 998 | 14 0 | 14 999 | 0 1,793 | 1,793 -----------+----------------------+---------- Total | 14 1,793 | 2,293 . . // #3 . // recode variables with include . . use wf-include, clear (Workflow data to illustrate include command \ 2008-04-02) . . // recode two digit missing values . . local varname inneighb . include wf4-include-2digit-recode.doi . // include: wf4-include-2digit-recode.doi . 
// used by: wf4-include.do \ for stata 9 . // task: recode 97 98 99 . // project: workflow chapter 4 . // author: scott long \ 2008-10-24 . . // note: code assumes that local varname is defined . // with the name of the variable to be recoded. . . clonevar `varname'R = `varname' . replace `varname'R = .a if `varname'R==97 (0 real changes made) . replace `varname'R = .b if `varname'R==98 (10 real changes made, 10 to missing) . replace `varname'R = .c if `varname'R==99 (1793 real changes made, 1793 to missing) . tabulate `varname' `varname'R, miss nolabel Q13 Would | have X as | Q13 Would have X as neighbor neighbor | 1 2 3 4 .b | Total -----------+-------------------------------------------------------+---------- 1 | 183 0 0 0 0 | 183 2 | 0 212 0 0 0 | 212 3 | 0 0 68 0 0 | 68 4 | 0 0 0 27 0 | 27 98 | 0 0 0 0 10 | 10 99 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 183 212 68 27 10 | 2,293 | Q13 Would Q13 Would | have X as have X as | neighbor neighbor | .c | Total -----------+-----------+---------- 1 | 0 | 183 2 | 0 | 212 3 | 0 | 68 4 | 0 | 27 98 | 0 | 10 99 | 1,793 | 1,793 -----------+-----------+---------- Total | 1,793 | 2,293 . . local varname insocial . include wf4-include-2digit-recode.doi . // include: wf4-include-2digit-recode.doi . // used by: wf4-include.do \ for stata 9 . // task: recode 97 98 99 . // project: workflow chapter 4 . // author: scott long \ 2008-10-24 . . // note: code assumes that local varname is defined . // with the name of the variable to be recoded. . . clonevar `varname'R = `varname' . replace `varname'R = .a if `varname'R==97 (0 real changes made) . replace `varname'R = .b if `varname'R==98 (12 real changes made, 12 to missing) . replace `varname'R = .c if `varname'R==99 (1793 real changes made, 1793 to missing) . 
tabulate `varname' `varname'R, miss nolabel Q14 Would | socialize | Q14 Would socialize w X w X | 1 2 3 4 .b | Total -----------+-------------------------------------------------------+---------- 1 | 147 0 0 0 0 | 147 2 | 0 207 0 0 0 | 207 3 | 0 0 92 0 0 | 92 4 | 0 0 0 42 0 | 42 98 | 0 0 0 0 12 | 12 99 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 147 207 92 42 12 | 2,293 | Q14 Would Q14 Would | socialize socialize | w X w X | .c | Total -----------+-----------+---------- 1 | 0 | 147 2 | 0 | 207 3 | 0 | 92 4 | 0 | 42 98 | 0 | 12 99 | 1,793 | 1,793 -----------+-----------+---------- Total | 1,793 | 2,293 . . local varname inchild . include wf4-include-2digit-recode.doi . // include: wf4-include-2digit-recode.doi . // used by: wf4-include.do \ for stata 9 . // task: recode 97 98 99 . // project: workflow chapter 4 . // author: scott long \ 2008-10-24 . . // note: code assumes that local varname is defined . // with the name of the variable to be recoded. . . clonevar `varname'R = `varname' . replace `varname'R = .a if `varname'R==97 (1 real change made, 1 to missing) . replace `varname'R = .b if `varname'R==98 (18 real changes made, 18 to missing) . replace `varname'R = .c if `varname'R==99 (1793 real changes made, 1793 to missing) . 
tabulate `varname' `varname'R, miss nolabel Q15 Would | let X care | for | Q15 Would let X care for children children | 1 2 3 4 .a | Total -----------+-------------------------------------------------------+---------- 1 | 54 0 0 0 0 | 54 2 | 0 82 0 0 0 | 82 3 | 0 0 146 0 0 | 146 4 | 0 0 0 199 0 | 199 97 | 0 0 0 0 1 | 1 98 | 0 0 0 0 0 | 18 99 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 54 82 146 199 1 | 2,293 Q15 Would | let X care | Q15 Would let X care for | for children children | .b .c | Total -----------+----------------------+---------- 1 | 0 0 | 54 2 | 0 0 | 82 3 | 0 0 | 146 4 | 0 0 | 199 97 | 0 0 | 1 98 | 18 0 | 18 99 | 0 1,793 | 1,793 -----------+----------------------+---------- Total | 18 1,793 | 2,293 . . local varname infriend . include wf4-include-2digit-recode.doi . // include: wf4-include-2digit-recode.doi . // used by: wf4-include.do \ for stata 9 . // task: recode 97 98 99 . // project: workflow chapter 4 . // author: scott long \ 2008-10-24 . . // note: code assumes that local varname is defined . // with the name of the variable to be recoded. . . clonevar `varname'R = `varname' . replace `varname'R = .a if `varname'R==97 (0 real changes made) . replace `varname'R = .b if `varname'R==98 (0 real changes made) . replace `varname'R = .c if `varname'R==99 (0 real changes made) . 
tabulate `varname' `varname'R, miss nolabel Q16 Would | be friends | Q16 Would be friends w X w X | 1 2 3 4 998 | Total -----------+-------------------------------------------------------+---------- 1 | 161 0 0 0 0 | 161 2 | 0 186 0 0 0 | 186 3 | 0 0 99 0 0 | 99 4 | 0 0 0 41 0 | 41 998 | 0 0 0 0 13 | 13 999 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 161 186 99 41 13 | 2,293 | Q16 Would Q16 Would | be friends be friends | w X w X | 999 | Total -----------+-----------+---------- 1 | 0 | 161 2 | 0 | 186 3 | 0 | 99 4 | 0 | 41 998 | 0 | 13 999 | 1,793 | 1,793 -----------+-----------+---------- Total | 1,793 | 2,293 . . . // recode three digit missing values . . local varname inmarry . include wf4-include-3digit-recode.doi . // include: wf4-include-3digit-recode.doi . // used by: wf4-include.do \ for stata 9 . // task: recode 97 98 99 . // project: workflow chapter 4 . // author: scott long \ 2008-10-24 . . // note: code assumes that local varname is defined . // with the name of the variable to be recoded. . . clonevar `varname'R = `varname' . replace `varname'R = .a if `varname'R==997 (2 real changes made, 2 to missing) . replace `varname'R = .b if `varname'R==998 (43 real changes made, 43 to missing) . replace `varname'R = .c if `varname'R==999 (1793 real changes made, 1793 to missing) . 
tabulate `varname' `varname'R, miss nolabel Q18 Would | let X | marry | Q18 Would let X marry relative relative | 1 2 3 4 .a | Total -----------+-------------------------------------------------------+---------- 1 | 81 0 0 0 0 | 81 2 | 0 134 0 0 0 | 134 3 | 0 0 117 0 0 | 117 4 | 0 0 0 123 0 | 123 997 | 0 0 0 0 2 | 2 998 | 0 0 0 0 0 | 43 999 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 81 134 117 123 2 | 2,293 Q18 Would | let X | Q18 Would let X marry marry | relative relative | .b .c | Total -----------+----------------------+---------- 1 | 0 0 | 81 2 | 0 0 | 134 3 | 0 0 | 117 4 | 0 0 | 123 997 | 0 0 | 2 998 | 43 0 | 43 999 | 0 1,793 | 1,793 -----------+----------------------+---------- Total | 43 1,793 | 2,293 . . local varname inwork . include wf4-include-3digit-recode.doi . // include: wf4-include-3digit-recode.doi . // used by: wf4-include.do \ for stata 9 . // task: recode 97 98 99 . // project: workflow chapter 4 . // author: scott long \ 2008-10-24 . . // note: code assumes that local varname is defined . // with the name of the variable to be recoded. . . clonevar `varname'R = `varname' . replace `varname'R = .a if `varname'R==997 (1 real change made, 1 to missing) . replace `varname'R = .b if `varname'R==998 (14 real changes made, 14 to missing) . replace `varname'R = .c if `varname'R==999 (1793 real changes made, 1793 to missing) . 
tabulate `varname' `varname'R, miss nolabel Q17 Would | work w X | Q17 Would work w X on job on job | 1 2 3 4 .a | Total -----------+-------------------------------------------------------+---------- 1 | 133 0 0 0 0 | 133 2 | 0 201 0 0 0 | 201 3 | 0 0 92 0 0 | 92 4 | 0 0 0 59 0 | 59 997 | 0 0 0 0 1 | 1 998 | 0 0 0 0 0 | 14 999 | 0 0 0 0 0 | 1,793 -----------+-------------------------------------------------------+---------- Total | 133 201 92 59 1 | 2,293 Q17 Would | Q17 Would work w X on work w X | job on job | .b .c | Total -----------+----------------------+---------- 1 | 0 0 | 133 2 | 0 0 | 201 3 | 0 0 | 92 4 | 0 0 | 59 997 | 0 0 | 1 998 | 14 0 | 14 999 | 0 1,793 | 1,793 -----------+----------------------+---------- Total | 14 1,793 | 2,293 . . . log close log: D:\wf\work\wf4-include.log log type: text closed on: 24 Oct 2008, 09:41:06 -------------------------------------------------------------------------------- . exit end of do-file . . log close master log: D:\wf\work\wf4.log log type: text closed on: 24 Oct 2008, 09:41:06 -------------------------------------------------------------------------------- . exit end of do-file . do wf5.do . capture log close master . log using wf5, name(master) replace text (note: file D:\wf\work\wf5.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf5.log log type: text opened on: 24 Oct 2008, 09:41:06 . . // program: wf5.do \ for stata 9 . // task: run all do-files in the order they appear . // project: workflow - chapter 5 . // author: scott long \ 2008-10-24 . . * data signatures . do wf5-datasignature.do, nostop . capture log close . log using wf5-datasignature, replace text (note: file D:\wf\work\wf5-datasignature.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf5-datasignature.log log type: text opened on: 24 Oct 2008, 09:41:06 . . // program: wf5-datasignature.do \ for stata 9 . 
// task: using datasignature . // project: workflow chapter 5 . // author: scott long \ 2008-10-24 . . // note: The datasignature command in Stata 9 was undocumented and . // difficult to use effectively for data management. The revised . // datasignature command in Stata 10 is an essential part of an . // effective workflow. For details, see the Workflow book. . . // note: This example only works in Stata 10. . . log close log: D:\wf\work\wf5-datasignature.log log type: text closed on: 24 Oct 2008, 09:41:06 -------------------------------------------------------------------------------- . exit end of do-file . . * variables . do wf5-varnames.do . capture log close . log using wf5-varnames, text replace (note: file D:\wf\work\wf5-varnames.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf5-varnames.log log type: text opened on: 24 Oct 2008, 09:41:06 . . // program: wf5-varnames.do \ for stata 9 . // task: naming variables . // project: workflow chapter 5 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // if something is new, give it a new name . . * do NOT do it this way . use wf-names, clear (Workflow data to illustrate names \ 2008-04-03) . sum var27 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- var27 | 753 35.66993 26.44393 1.282051 250 . replace var27 = 100 if var27>100 & var27<. (16 real changes made) . sum var27 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- var27 | 753 34.30911 19.31765 1.282051 100 . . . // #2 . // cloning versus generating variables . . use wf-names, clear (Workflow data to illustrate names \ 2008-04-03) . generate lfp_gen = lfp (327 missing values generated) . clonevar lfp_clone = lfp (327 missing values generated) . 
codebook lfp*, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- lfp 753 2 .5683931 0 1 Paid labor force? lfp_gen 753 2 .5683931 0 1 lfp_clone 753 2 .5683931 0 1 Paid labor force? -------------------------------------------------------------------------------- . describe lfp* storage display value variable name type format label variable label ------------------------------------------------------------------------------- lfp byte %9.0g lfp Paid labor force? lfp_gen float %9.0g lfp_clone byte %9.0g lfp Paid labor force? . . // #3 . // lookfor . . lookfor race storage display value variable name type format label variable label ------------------------------------------------------------------------------- racewhite byte %9.0g Lyn Is white? raceblack byte %9.0g Lyn Is black? raceasian byte %9.0g Lyn Is asian? . . // #4 . // recoding a variable by creating a new variable . . * do it this way . use wf-names, clear (Workflow data to illustrate names \ 2008-04-03) . gen var27trunc = var27 (327 missing values generated) . replace var27trunc = 100 if var27trunc>100 & var27trunc<. (16 real changes made) . . * or this way . use wf-names, clear (Workflow data to illustrate names \ 2008-04-03) . clonevar var27trunc = var27 (327 missing values generated) . replace var27trunc = 100 if var27trunc>100 & var27trunc<. (16 real changes made) . . * recoding a missing value . clonevar educV2 = educ (327 missing values generated) . replace educV2 = . if educV2==99 (5 real changes made, 5 to missing) . . // #5 . // leading 0's are ignored with aorder . . use wf-names, clear (Workflow data to illustrate names \ 2008-04-03) . keep vs* . aorder . 
codebook, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- vs1 753 2 .2815405 0 1 Variable 1 vs01 753 2 .2815405 0 1 Variable 01 vs2 753 2 .2815405 0 1 Variable 2 vs02 753 2 .2815405 0 1 Variable 02 vs10 753 2 .2815405 0 1 Variable 10 vs11 753 2 .2815405 0 1 Variable 11 -------------------------------------------------------------------------------- . . // #6 . // use simple, unambiguous names . . * long names . clear . set obs 100 obs was 0, now 100 . set seed 20070323 . generate a2345678901234567890123456789012 = uniform() // renamed runiform() i > n stata 10 . label var a2345678901234567890123456789012 "Long name 1." . generate a23456789012345678901234567890_1 = uniform() // renamed runiform() i > n stata 10 . label var a23456789012345678901234567890_1 "Long name 2." . generate a23456789012345678901234567890_2 = uniform() // renamed runiform() i > n stata 10 . label var a23456789012345678901234567890_2 "Long name 3." . summarize Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- a23456789~12 | 100 .4718318 .2695077 .0118152 .9889972 a234567890~1 | 100 .4994476 .2749245 .0068972 .9929506 a23456789~_2 | 100 .4973259 .3026792 .0075843 .9889733 . describe Contains data obs: 100 vars: 3 size: 1,600 (99.9% of memory free) ------------------------------------------------------------------------------- storage display value variable name type format label variable label ------------------------------------------------------------------------------- a23456789012~12 float %9.0g Long name 1. a234567890123~1 float %9.0g Long name 2. a23456789012~_2 float %9.0g Long name 3. ------------------------------------------------------------------------------- Sorted by: Note: dataset has changed since last saved . . * changing long names . use wf-names, clear (Workflow data to illustrate names \ 2008-04-03) . rename socialdistance socdist . 
label var socdist "socialdistance-Social distance from a person with MI." . describe socdist storage display value variable name type format label variable label ------------------------------------------------------------------------------- socdist byte %9.0g socialdistance-Social distance from a person with MI. . . // #7 . // be careful with capitalization . . summarize ed Ed ED Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- ed | 753 11.62815 6.768973 0 23 Ed | 753 11.60691 6.713717 0 23 ED | 753 11.60027 6.849122 0 23 . . log close log: D:\wf\work\wf5-varnames.log log type: text closed on: 24 Oct 2008, 09:41:06 -------------------------------------------------------------------------------- . exit end of do-file . do wf5-varlabels.do . capture log close . log using wf5-varlabels, replace text (note: file D:\wf\work\wf5-varlabels.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf5-varlabels.log log type: text opened on: 24 Oct 2008, 09:41:06 . . // program: wf5-varlabels.do \ for stata 9 . // task: labelling variables . // project: workflow chapter 5 . // author: scott long \ 2008-10-24 . . local date "2008-10-24" . local tag "wf5-varlabels.do \ jsl `date'." . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data . . use wf-names, clear (Workflow data to illustrate names \ 2008-04-03) . . // #2 . // simple example - add variable label to new variable . . generate artsqrt = sqrt(pub1) (772 missing values generated) . label var artsqrt "Square root of # of articles" . . // #3 . // variable lists . . use wf-names, clear (Workflow data to illustrate names \ 2008-04-03) . . * codebook command . 
codebook id tc1fam tc2fam tc3fam vignum, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- id 1080 1080 540.5 1 1080 Identification number tc1fam 1074 10 8.755121 1 10 Q43 How important is it to turn ... tc2fam 1074 10 8.755121 1 10 Q43 How Impt: Turn to family for... tc3fam 1074 10 8.755121 1 10 Q43 Family help important vignum 1080 12 6.187963 1 12 Vignette number -------------------------------------------------------------------------------- . . * describe command . describe id tc1fam tc2fam tc3fam vignum storage display value variable name type format label variable label ------------------------------------------------------------------------------- id int %9.0g Identification number tc1fam byte %21.0g Ltenpt Q43 How important is it to turn to family for help tc2fam byte %21.0g Ltenpt Q43 How Impt: Turn to family for help tc3fam byte %21.0g Ltenpt Q43 Family help important vignum byte %35.0g vignum * Vignette number . describe, simple id tc1fam tc2mhprof Ed var14 vignum tc2fam tc3mhprof ED var15 female tc3fam phd age_verified var16 serious tc1friend pub1 var1 var17 opnoth tc2friend pub3 var2 var18 opfam tc3friend pub6 var3 var19 opfriend tc1relig pub9 var4 k5 oprelig tc2relig var27 var5 wc_v1 opdoc tc3relig educ var6 wc_v2 sdchild tc1doc vs1 var7 wc_v3 sdneighb tc2doc vs2 var8 lfp sdsocial tc3doc vs10 var9 socialdist~e sdfriend tc1psy vs01 var10 racewhite sdwork tc2psy vs02 var11 raceblack sdmarry tc3psy vs11 var12 raceasian tcfam tc1mhprof ed var13 . describe id-opdoc, simple id female opnoth opfriend opdoc vignum serious opfam oprelig . . * nmlab command . nmlab id tc1fam tc2fam tc3fam vignum id Identification number tc1fam Q43 How important is it to turn to family for help tc2fam Q43 How Impt: Turn to family for help tc3fam Q43 Family help important vignum Vignette number . nmlab id tc1fam tc2fam tc3fam vignum, number 1. id Identification number 2. 
tc1fam Q43 How important is it to turn to family for help 3. tc2fam Q43 How Impt: Turn to family for help 4. tc3fam Q43 Family help important 5. vignum Vignette number . nmlab id tc1fam tc2fam tc3fam vignum, number col(20) 1. id Identification number 2. tc1fam Q43 How important is it to turn to family for help 3. tc2fam Q43 How Impt: Turn to family for help 4. tc3fam Q43 Family help important 5. vignum Vignette number . . // #2 . // order and aorder commands . . * before ordering . nmlab id Identification number vignum Vignette number female R is female? serious Q01 How serious is Xs problem opnoth Q02_00 X do nothing opfam Q02_01 X talk to family opfriend Q02_02 X talk to friends oprelig Q02_03 X talk to relig leader opdoc Q02_04 X see medical doctor sdchild Q15 Would let X care for children sdneighb Q13 Would have X as neighbor sdsocial Q14 Would socialize w X sdfriend Q16 Would be friends w X sdwork Q17 Would work w X on job sdmarry Q18 Would let X marry relative tcfam Q43 How Impt: Turn to family for help tc1fam Q43 How important is it to turn to family for help tc2fam Q43 How Impt: Turn to family for help tc3fam Q43 Family help important tc1friend Q44 How important is it to turn to friends for help tc2friend Q44 How Impt: Turn to friends for help tc3friend Q44 Friends help important tc1relig Q45 How important is it to turn to a minister, priest, rabbi or > other religious tc2relig Q45 How Impt: Turn to a religious leader tc3relig Q45 Relig leader help important tc1doc Q46 How important is it to go to a general medical doctor for he > lp tc2doc Q46 How Impt: Go to a gen med doctor for help tc3doc Q46 Med doctor help important tc1psy Q47 How important is it to go to a psychiatrist for help tc2psy Q47 How Impt: Go to a psych for Help tc3psy Q47 Psychiatric help important tc1mhprof Q48 How important is it to go to a mental health professional tc2mhprof Q48 How Impt: Go to a mental health prof tc3mhprof Q48 MH prof help important phd Prestige of Ph.D. 
department pub1 Publications: PhD yr -1 to 1 pub3 Publications: PhD yr 1 to 3 pub6 Publications: PhD yr 4 to 6 pub9 Publications: PhD yr 7 to 9 var27 Naming variables - truncation educ Naming variables - replacing values vs1 Variable 1 vs2 Variable 2 vs10 Variable 10 vs01 Variable 01 vs02 Variable 02 vs11 Variable 11 ed Lower case education Ed Sentence case education ED Upper case education age_verified Wife's age in years var1 Random variable 1 var2 Random variable 2 var3 Random variable 3 var4 Random variable 4 var5 Random variable 5 var6 Random variable 6 var7 Random variable 7 var8 Random variable 8 var9 Random variable 9 var10 Random variable 10 var11 Random variable 11 var12 Random variable 12 var13 Random variable 13 var14 Random variable 14 var15 Random variable 15 var16 Random variable 16 var17 Random variable 17 var18 Random variable 18 var19 Random variable 19 k5 # of children younger than 6 wc_v1 Did wife attend college? wc_v2 Did wife attend college? wc_v3 Did wife attend college? lfp Paid labor force? socialdistance Social distance from a person with MI racewhite Is white? raceblack Is black? raceasian Is asian? . . * after ordering . aorder . order id . nmlab id Identification number ED Upper case education Ed Sentence case education age_verified Wife's age in years ed Lower case education educ Naming variables - replacing values female R is female? k5 # of children younger than 6 lfp Paid labor force? opdoc Q02_04 X see medical doctor opfam Q02_01 X talk to family opfriend Q02_02 X talk to friends opnoth Q02_00 X do nothing oprelig Q02_03 X talk to relig leader phd Prestige of Ph.D. department pub1 Publications: PhD yr -1 to 1 pub3 Publications: PhD yr 1 to 3 pub6 Publications: PhD yr 4 to 6 pub9 Publications: PhD yr 7 to 9 raceasian Is asian? raceblack Is black? racewhite Is white? 
sdchild Q15 Would let X care for children sdfriend Q16 Would be friends w X sdmarry Q18 Would let X marry relative sdneighb Q13 Would have X as neighbor sdsocial Q14 Would socialize w X sdwork Q17 Would work w X on job serious Q01 How serious is Xs problem socialdistance Social distance from a person with MI tc1doc Q46 How important is it to go to a general medical doctor for he > lp tc1fam Q43 How important is it to turn to family for help tc1friend Q44 How important is it to turn to friends for help tc1mhprof Q48 How important is it to go to a mental health professional tc1psy Q47 How important is it to go to a psychiatrist for help tc1relig Q45 How important is it to turn to a minister, priest, rabbi or > other religious tc2doc Q46 How Impt: Go to a gen med doctor for help tc2fam Q43 How Impt: Turn to family for help tc2friend Q44 How Impt: Turn to friends for help tc2mhprof Q48 How Impt: Go to a mental health prof tc2psy Q47 How Impt: Go to a psych for Help tc2relig Q45 How Impt: Turn to a religious leader tc3doc Q46 Med doctor help important tc3fam Q43 Family help important tc3friend Q44 Friends help important tc3mhprof Q48 MH prof help important tc3psy Q47 Psychiatric help important tc3relig Q45 Relig leader help important tcfam Q43 How Impt: Turn to family for help var1 Random variable 1 var2 Random variable 2 var3 Random variable 3 var4 Random variable 4 var5 Random variable 5 var6 Random variable 6 var7 Random variable 7 var8 Random variable 8 var9 Random variable 9 var10 Random variable 10 var11 Random variable 11 var12 Random variable 12 var13 Random variable 13 var14 Random variable 14 var15 Random variable 15 var16 Random variable 16 var17 Random variable 17 var18 Random variable 18 var19 Random variable 19 var27 Naming variables - truncation vignum Vignette number vs1 Variable 1 vs01 Variable 01 vs2 Variable 2 vs02 Variable 02 vs10 Variable 10 vs11 Variable 11 wc_v1 Did wife attend college? wc_v2 Did wife attend college? wc_v3 Did wife attend college? 
. . // #3 . // truncated variable labels . . * labels are too long with critical information at the end . set linesize 140 // the entire label will appear but run off the page . nmlab tc1* tc1doc Q46 How important is it to go to a general medical doctor for help tc1fam Q43 How important is it to turn to family for help tc1friend Q44 How important is it to turn to friends for help tc1mhprof Q48 How important is it to go to a mental health professional tc1psy Q47 How important is it to go to a psychiatrist for help tc1relig Q45 How important is it to turn to a minister, priest, rabbi or other religious . set linesize 80 // only 80 columns will be shown . codebook tc1*, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- tc1doc 1074 10 8.714153 1 10 Q46 How important is it to go to ... tc1fam 1074 10 8.755121 1 10 Q43 How important is it to turn t... tc1friend 1073 10 7.799627 1 10 Q44 How important is it to turn t... tc1mhprof 1045 10 7.58756 1 10 Q48 How important is it to go to ... tc1psy 1050 10 7.567619 1 10 Q47 How important is it to go to ... tc1relig 1039 10 5.66025 1 10 Q45 How important is it to turn t... -------------------------------------------------------------------------------- . . * better labels . nmlab tc2* tc2doc Q46 How Impt: Go to a gen med doctor for help tc2fam Q43 How Impt: Turn to family for help tc2friend Q44 How Impt: Turn to friends for help tc2mhprof Q48 How Impt: Go to a mental health prof tc2psy Q47 How Impt: Go to a psych for Help tc2relig Q45 How Impt: Turn to a religious leader . codebook tc2*, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- tc2doc 1074 10 8.714153 1 10 Q46 How Impt: Go to a gen med doc... tc2fam 1074 10 8.755121 1 10 Q43 How Impt: Turn to family for ... tc2friend 1073 10 7.799627 1 10 Q44 How Impt: Turn to friends for... 
tc2mhprof 1045 10 7.58756 1 10 Q48 How Impt: Go to a mental heal... tc2psy 1050 10 7.567619 1 10 Q47 How Impt: Go to a psych for Help tc2relig 1039 10 5.66025 1 10 Q45 How Impt: Turn to a religious... -------------------------------------------------------------------------------- . . * labels we used . nmlab tc3* tc3doc Q46 Med doctor help important tc3fam Q43 Family help important tc3friend Q44 Friends help important tc3mhprof Q48 MH prof help important tc3psy Q47 Psychiatric help important tc3relig Q45 Relig leader help important . codebook tc3*, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- tc3doc 1074 10 8.714153 1 10 Q46 Med doctor help important tc3fam 1074 10 8.755121 1 10 Q43 Family help important tc3friend 1073 10 7.799627 1 10 Q44 Friends help important tc3mhprof 1045 10 7.58756 1 10 Q48 MH prof help important tc3psy 1050 10 7.567619 1 10 Q47 Psychiatric help important tc3relig 1039 10 5.66025 1 10 Q45 Relig leader help important -------------------------------------------------------------------------------- . . * what to include? . generate tcfamsqrt = sqrt(tcfam) (6 missing values generated) . label var tcfamsqrt /// > "Q43 Sqrt family help important? \ `tag'" . tabulate tcfamsqrt, missing Q43 Sqrt | family help | important? | \ | Freq. Percent Cum. ------------+----------------------------------- 1 | 9 0.83 0.83 1.414214 | 4 0.37 1.20 1.732051 | 11 1.02 2.22 2 | 13 1.20 3.43 2.236068 | 53 4.91 8.33 2.44949 | 51 4.72 13.06 2.645751 | 75 6.94 20.00 2.828427 | 139 12.87 32.87 3 | 97 8.98 41.85 3.162278 | 622 57.59 99.44 . | 6 0.56 100.00 ------------+----------------------------------- Total | 1,080 100.00 . . * checking labels . 
codebook tc3*, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- tc3doc 1074 10 8.714153 1 10 Q46 Med doctor help important tc3fam 1074 10 8.755121 1 10 Q43 Family help important tc3friend 1073 10 7.799627 1 10 Q44 Friends help important tc3mhprof 1045 10 7.58756 1 10 Q48 MH prof help important tc3psy 1050 10 7.567619 1 10 Q47 Psychiatric help important tc3relig 1039 10 5.66025 1 10 Q45 Relig leader help important -------------------------------------------------------------------------------- . tabulate tcfam, missing Q43 How Impt: | Turn to family | for help | Freq. Percent Cum. -----------------+----------------------------------- 1Not at all Impt | 9 0.83 0.83 2 | 4 0.37 1.20 3 | 11 1.02 2.22 4 | 13 1.20 3.43 5 | 53 4.91 8.33 6 | 51 4.72 13.06 7 | 75 6.94 20.00 8 | 139 12.87 32.87 9 | 97 8.98 41.85 10Vry Impt | 622 57.59 99.44 .c_Dont know | 6 0.56 100.00 -----------------+----------------------------------- Total | 1,080 100.00 . . * example of a label that is too long . clonevar tcfamV2 = tcfam (6 missing values generated) . label var tcfamV2 /// > " Question 43: How important is it to you to turn to the family for suppor > t?" . tabulate tcfamV2, missing Question 43: How | important is it | to you to turn | to the family | for support? | Freq. Percent Cum. -----------------+----------------------------------- 1Not at all Impt | 9 0.83 0.83 2 | 4 0.37 1.20 3 | 11 1.02 2.22 4 | 13 1.20 3.43 5 | 53 4.91 8.33 6 | 51 4.72 13.06 7 | 75 6.94 20.00 8 | 139 12.87 32.87 9 | 97 8.98 41.85 10Vry Impt | 622 57.59 99.44 .c_Dont know | 6 0.56 100.00 -----------------+----------------------------------- Total | 1,080 100.00 . . // #4 . // temporarily changing variable labels . . * tabulate with the original labels . foreach varname in pub1 pub3 pub6 pub9 { 2. tabulate `varname', missing 3. } Publication | s: PhD yr | -1 to 1 | Freq. Percent Cum. 
------------+----------------------------------- 0 | 77 7.13 7.13 1 | 75 6.94 14.07 2 | 36 3.33 17.41 3 | 37 3.43 20.83 4 | 29 2.69 23.52 5 | 18 1.67 25.19 6 | 12 1.11 26.30 7 | 8 0.74 27.04 8 | 3 0.28 27.31 9 | 3 0.28 27.59 10 | 4 0.37 27.96 11 | 1 0.09 28.06 13 | 1 0.09 28.15 16 | 1 0.09 28.24 18 | 1 0.09 28.33 19 | 1 0.09 28.43 24 | 1 0.09 28.52 . | 772 71.48 100.00 ------------+----------------------------------- Total | 1,080 100.00 Publication | s: PhD yr 1 | to 3 | Freq. Percent Cum. ------------+----------------------------------- 0 | 63 5.83 5.83 1 | 61 5.65 11.48 2 | 50 4.63 16.11 3 | 30 2.78 18.89 4 | 27 2.50 21.39 5 | 26 2.41 23.80 6 | 9 0.83 24.63 7 | 13 1.20 25.83 8 | 10 0.93 26.76 9 | 3 0.28 27.04 10 | 6 0.56 27.59 11 | 1 0.09 27.69 12 | 2 0.19 27.87 13 | 1 0.09 27.96 15 | 2 0.19 28.15 16 | 1 0.09 28.24 27 | 1 0.09 28.33 28 | 1 0.09 28.43 31 | 1 0.09 28.52 . | 772 71.48 100.00 ------------+----------------------------------- Total | 1,080 100.00 Publication | s: PhD yr 4 | to 6 | Freq. Percent Cum. ------------+----------------------------------- 0 | 59 5.46 5.46 1 | 43 3.98 9.44 2 | 42 3.89 13.33 3 | 33 3.06 16.39 4 | 27 2.50 18.89 5 | 20 1.85 20.74 6 | 26 2.41 23.15 7 | 8 0.74 23.89 8 | 13 1.20 25.09 9 | 8 0.74 25.83 10 | 5 0.46 26.30 11 | 3 0.28 26.57 12 | 3 0.28 26.85 13 | 1 0.09 26.94 14 | 2 0.19 27.13 15 | 1 0.09 27.22 16 | 1 0.09 27.31 17 | 3 0.28 27.59 18 | 2 0.19 27.78 19 | 1 0.09 27.87 20 | 1 0.09 27.96 21 | 1 0.09 28.06 22 | 1 0.09 28.15 23 | 2 0.19 28.33 26 | 1 0.09 28.43 29 | 1 0.09 28.52 . | 772 71.48 100.00 ------------+----------------------------------- Total | 1,080 100.00 Publication | s: PhD yr 7 | to 9 | Freq. Percent Cum. 
------------+----------------------------------- 0 | 61 5.65 5.65 1 | 31 2.87 8.52 2 | 42 3.89 12.41 3 | 36 3.33 15.74 4 | 31 2.87 18.61 5 | 21 1.94 20.56 6 | 18 1.67 22.22 7 | 15 1.39 23.61 8 | 9 0.83 24.44 9 | 11 1.02 25.46 10 | 2 0.19 25.65 11 | 2 0.19 25.83 12 | 6 0.56 26.39 13 | 6 0.56 26.94 14 | 2 0.19 27.13 15 | 1 0.09 27.22 16 | 2 0.19 27.41 18 | 1 0.09 27.50 19 | 1 0.09 27.59 20 | 2 0.19 27.78 21 | 1 0.09 27.87 23 | 1 0.09 27.96 24 | 1 0.09 28.06 25 | 1 0.09 28.15 27 | 2 0.19 28.33 30 | 1 0.09 28.43 33 | 1 0.09 28.52 . | 772 71.48 100.00 ------------+----------------------------------- Total | 1,080 100.00 . . * tabulate after removing the label . foreach varname in pub1 pub3 pub6 pub9 { 2. label var `varname' "" 3. tabulate `varname', missing 4. } pub1 | Freq. Percent Cum. ------------+----------------------------------- 0 | 77 7.13 7.13 1 | 75 6.94 14.07 2 | 36 3.33 17.41 3 | 37 3.43 20.83 4 | 29 2.69 23.52 5 | 18 1.67 25.19 6 | 12 1.11 26.30 7 | 8 0.74 27.04 8 | 3 0.28 27.31 9 | 3 0.28 27.59 10 | 4 0.37 27.96 11 | 1 0.09 28.06 13 | 1 0.09 28.15 16 | 1 0.09 28.24 18 | 1 0.09 28.33 19 | 1 0.09 28.43 24 | 1 0.09 28.52 . | 772 71.48 100.00 ------------+----------------------------------- Total | 1,080 100.00 pub3 | Freq. Percent Cum. ------------+----------------------------------- 0 | 63 5.83 5.83 1 | 61 5.65 11.48 2 | 50 4.63 16.11 3 | 30 2.78 18.89 4 | 27 2.50 21.39 5 | 26 2.41 23.80 6 | 9 0.83 24.63 7 | 13 1.20 25.83 8 | 10 0.93 26.76 9 | 3 0.28 27.04 10 | 6 0.56 27.59 11 | 1 0.09 27.69 12 | 2 0.19 27.87 13 | 1 0.09 27.96 15 | 2 0.19 28.15 16 | 1 0.09 28.24 27 | 1 0.09 28.33 28 | 1 0.09 28.43 31 | 1 0.09 28.52 . | 772 71.48 100.00 ------------+----------------------------------- Total | 1,080 100.00 pub6 | Freq. Percent Cum. 
------------+----------------------------------- 0 | 59 5.46 5.46 1 | 43 3.98 9.44 2 | 42 3.89 13.33 3 | 33 3.06 16.39 4 | 27 2.50 18.89 5 | 20 1.85 20.74 6 | 26 2.41 23.15 7 | 8 0.74 23.89 8 | 13 1.20 25.09 9 | 8 0.74 25.83 10 | 5 0.46 26.30 11 | 3 0.28 26.57 12 | 3 0.28 26.85 13 | 1 0.09 26.94 14 | 2 0.19 27.13 15 | 1 0.09 27.22 16 | 1 0.09 27.31 17 | 3 0.28 27.59 18 | 2 0.19 27.78 19 | 1 0.09 27.87 20 | 1 0.09 27.96 21 | 1 0.09 28.06 22 | 1 0.09 28.15 23 | 2 0.19 28.33 26 | 1 0.09 28.43 29 | 1 0.09 28.52 . | 772 71.48 100.00 ------------+----------------------------------- Total | 1,080 100.00 pub9 | Freq. Percent Cum. ------------+----------------------------------- 0 | 61 5.65 5.65 1 | 31 2.87 8.52 2 | 42 3.89 12.41 3 | 36 3.33 15.74 4 | 31 2.87 18.61 5 | 21 1.94 20.56 6 | 18 1.67 22.22 7 | 15 1.39 23.61 8 | 9 0.83 24.44 9 | 11 1.02 25.46 10 | 2 0.19 25.65 11 | 2 0.19 25.83 12 | 6 0.56 26.39 13 | 6 0.56 26.94 14 | 2 0.19 27.13 15 | 1 0.09 27.22 16 | 2 0.19 27.41 18 | 1 0.09 27.50 19 | 1 0.09 27.59 20 | 2 0.19 27.78 21 | 1 0.09 27.87 23 | 1 0.09 27.96 24 | 1 0.09 28.06 25 | 1 0.09 28.15 27 | 2 0.19 28.33 30 | 1 0.09 28.43 33 | 1 0.09 28.52 . | 772 71.48 100.00 ------------+----------------------------------- Total | 1,080 100.00 . . * labels in graphs . scatter phd pub1 . graph export wf5-varlabels-original.eps, replace (note: file wf5-varlabels-original.eps not found) (file wf5-varlabels-original.eps written in EPS format) . . * change the labels for the axes . label var pub1 "Articles at time of Ph.D." . label var phd "Ph.D. Prestige" . scatter phd pub1 . graph export wf5-varlabels-revised.eps, replace (note: file wf5-varlabels-revised.eps not found) (file wf5-varlabels-revised.eps written in EPS format) . . log close log: D:\wf\work\wf5-varlabels.log log type: text closed on: 24 Oct 2008, 09:41:08 -------------------------------------------------------------------------------- . exit end of do-file . do wf5-varname-to-label.do . capture log close . 
log using wf5-varname-to-label, replace text (note: file D:\wf\work\wf5-varname-to-label.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf5-varname-to-label.log log type: text opened on: 24 Oct 2008, 09:41:08 . . // program: wf5-varname-to-label.do \ for stata 9 . // task: add the variable name to the variable label . // project: workflow chapter 5 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data . . use wf-lfp, clear (Workflow data on labor force participation \ 2008-04-02) . . // #2 . // check names and labels . . nmlab lfp In paid labor force? 1=yes 0=no k5 # kids < 6 k618 # kids 6-18 age Wife's age in years wc Wife attended college? 1=yes 0=no hc Husband attended college? 1=yes 0=no lwg Log of wife's estimated wages inc Family income excluding wife's . tabulate wc hc, missing Wife | attended | Husband attended college? | college? 1=yes 0=no 1=yes 0=no | 0_NoCol 1_College | Total -----------+----------------------+---------- 0_NoCol | 417 124 | 541 1_College | 41 171 | 212 -----------+----------------------+---------- Total | 458 295 | 753 . . // #3 . // loop through variables and add names to labels . . unab varlist : _all . display "varlist is: `varlist'" varlist is: lfp k5 k618 age wc hc lwg inc . . foreach varname in `varlist' { 2. local varlabel : variable label `varname' 3. label var `varname' "`varname': `varlabel'" 4. } . . // #4 . // check the results . . nmlab lfp lfp: In paid labor force? 1=yes 0=no k5 k5: # kids < 6 k618 k618: # kids 6-18 age age: Wife's age in years wc wc: Wife attended college? 1=yes 0=no hc hc: Husband attended college? 1=yes 0=no lwg lwg: Log of wife's estimated wages inc inc: Family income excluding wife's . tabulate wc hc, missing wc: Wife | attended | hc: Husband attended college? | college? 
1=yes 0=no 1=yes 0=no | 0_NoCol 1_College | Total -----------+----------------------+---------- 0_NoCol | 417 124 | 541 1_College | 41 171 | 212 -----------+----------------------+---------- Total | 458 295 | 753 . . log close log: D:\wf\work\wf5-varname-to-label.log log type: text closed on: 24 Oct 2008, 09:41:08 -------------------------------------------------------------------------------- . exit end of do-file . do wf5-varnotes.do . capture log close . log using wf5-varnotes, replace text (note: file D:\wf\work\wf5-varnotes.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf5-varnotes.log log type: text opened on: 24 Oct 2008, 09:41:08 . . // program: wf5-varnotes.do \ for stata 9 . // task: adding notes to variables . // project: workflow chapter 5 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data . . use wf-names, clear (Workflow data to illustrate names \ 2008-04-03) . . // #2 . // example of notes . . generate pub9trunc = pub9 (772 missing values generated) . replace pub9trunc = 20 if pub9trunc>20 & !missing(pub9trunc) (8 real changes made) . label var pub9trunc "Pub 9 truncated at 20: PhD yr 7 to 9" . note pub9trunc: pubs>20 recoded to 20 \ wf5-varnotes.do jsl 2008-10-24. . note pub9trunc pub9trunc: 1. pubs>20 recoded to 20 \ wf5-varnotes.do jsl 2008-10-24. . . // #3 . // long notes . . note pub9trunc: Earlier analyses (pubreg04a.do 2006-09-20) showed /// > that cases with a large number of articles were outliers. Program /// > pubreg04b.do 2006-09-21 examined different transformations of pub9 /// > and found that truncation at 20 was most effective at removing /// > the outliers. \ jsl 2008-10-24. . note pub9trunc pub9trunc: 1. pubs>20 recoded to 20 \ wf5-varnotes.do jsl 2008-10-24. 2.
Earlier analyses (pubreg04a.do 2006-09-20) showed that cases with a large number of articles were outliers. Program pubreg04b.do 2006-09-21 examined different transformations of pub9 and found that truncation at 20 was most effective at removing the outliers. \ jsl 2008-10-24. . . // #4 . // using TS | | . // You need spaces here V V . . note pub9trunc: pub9 truncated at 20 \ wf5-varnotes.do jsl TS . . note pub9trunc pub9trunc: 1. pubs>20 recoded to 20 \ wf5-varnotes.do jsl 2008-10-24. 2. Earlier analyses (pubreg04a.do 2006-09-20) showed that cases with a large number of articles were outliers. Program pubreg04b.do 2006-09-21 examined different transformations of pub9 and found that truncation at 20 was most effective at removing the outliers. \ jsl 2008-10-24. 3. pub9 truncated at 20 \ wf5-varnotes.do jsl 24 Oct 2008 09:41 . . . // #5 . // listing selected notes . . note list vignum in 2/3 vignum: 2. BGR - majority vs. minority = bulgarian vs. turk 3. ESP - majority vs. minority = spaniard vs. gypsy . . // #6 . // dropping notes . . notes drop vignum in 2/3 // drop some notes (2 notes dropped) . notes drop vignum // drop all notes (3 notes dropped) . . // #7 . // using tags for notes and listing notes with codebook . . use wf-names, clear (Workflow data to illustrate names \ 2008-04-03) . . * create the variables for the example . foreach varname in pub1 pub3 pub6 pub9 { 2. clonevar `varname'trunc = `varname' 3. replace `varname'trunc = 20 if `varname'trunc>20 /// > & !missing(`varname'trunc) 4. } (772 missing values generated) (1 real change made) (772 missing values generated) (3 real changes made) (772 missing values generated) (6 real changes made) (772 missing values generated) (8 real changes made) . . * add notes using a local tag . local tag "pub# truncated at 20 \ wf5-varnotes.do jsl 2008-10-24." . note pub1trunc: `tag' . note pub3trunc: `tag' . note pub6trunc: `tag' . note pub9trunc: `tag' . note pub* pub1trunc: 1.
pub# truncated at 20 \ wf5-varnotes.do jsl 2008-10-24. pub3trunc: 1. pub# truncated at 20 \ wf5-varnotes.do jsl 2008-10-24. pub6trunc: 1. pub# truncated at 20 \ wf5-varnotes.do jsl 2008-10-24. pub9trunc: 1. pub# truncated at 20 \ wf5-varnotes.do jsl 2008-10-24. . . * codebook . codebook pub1trunc, notes -------------------------------------------------------------------------------- pub1trunc Publications: PhD yr -1 to 1 -------------------------------------------------------------------------------- type: numeric (byte) range: [0,20] units: 1 unique values: 17 missing .: 772/1080 mean: 2.53247 std. dev: 3.00958 percentiles: 10% 25% 50% 75% 90% 0 .5 2 4 6 pub1trunc: 1. pub# truncated at 20 \ wf5-varnotes.do jsl 2008-10-24. . . // #9 . // notes in loops . . use wf-names, clear (Workflow data to illustrate names \ 2008-04-03) . local tag "wf5-varnotes.do jsl 2008-10-24." . . foreach varname in pub1 pub3 pub6 pub9 { 2. clonevar `varname'trunc = `varname' 3. replace `varname'trunc = 20 if `varname'trunc>20 /// > & !missing(`varname'trunc) 4. label var `varname'trunc "`varname' truncated at 20" 5. note `varname'trunc: `varname' truncated at 20 \ `tag' 6. } (772 missing values generated) (1 real change made) (772 missing values generated) (3 real changes made) (772 missing values generated) (6 real changes made) (772 missing values generated) (8 real changes made) . . log close log: D:\wf\work\wf5-varnotes.log log type: text closed on: 24 Oct 2008, 09:41:08 -------------------------------------------------------------------------------- . exit end of do-file . . * values . * do wf5-vallabels.do . . * language . do wf5-language.do . capture log close . log using wf5-language, replace text (note: file D:\wf\work\wf5-language.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf5-language.log log type: text opened on: 24 Oct 2008, 09:41:08 . . // program: wf5-language.do \ for stata 9 . 
// task: multiple languages . // project: workflow chapter 5 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // managing languages . . use wf-languages-spoken, clear (Workflow data with spoken languages \ 2008-04-03) . label language Language for variable and value labels Available languages: default english french spanish Currently set is: . label language spanish To select different language: . label language To create new language: . label language , new To rename current language: . label language , rename . label language english . tabulate male, missing Gender of | respondent | Freq. Percent Cum. ------------+----------------------------------- 0_Women | 1,227 53.51 53.51 1_Men | 1,066 46.49 100.00 ------------+----------------------------------- Total | 2,293 100.00 . label language french . tabulate male, missing Genre de | répondant | Freq. Percent Cum. ------------+----------------------------------- 0_Femmes | 1,227 53.51 53.51 1_Hommes | 1,066 46.49 100.00 ------------+----------------------------------- Total | 2,293 100.00 . label language spanish . tabulate male, missing Género del | respondedor | Freq. Percent Cum. ------------+----------------------------------- 0_Mujeres | 1,227 53.51 53.51 1_Hombres | 1,066 46.49 100.00 ------------+----------------------------------- Total | 2,293 100.00 . . // #2 . // add new languages . . * english french spanish . * Men Hommes Hombres . * Women Femmes Mujeres . . use wf-languages-single, clear (Workflow data with single language \ 2008-04-03) . . * english . label language english, new (language english now current language) . label define male 0 "0_Women" 1 "1_Men" . label val male male . label var male "Gender of respondent" . * french . label language french, new (language french now current language) . label define male_fr 0 "0_Femmes" 1 "1_Hommes" . label val male male_fr . 
label var male "Genre de répondant" . * spanish . label language spanish, new (language spanish now current language) . label define male_es 0 "0_Mujeres" 1 "1_Hombres" . label val male male_es . label var male "Género del respondedor" . . // #4 . // shorter and long labels // source and analysis languages . . use wf-languages-analysis, clear (Workflow data with analysis and source labels \ 2008-04-03) . label language source . describe male warm storage display value variable name type format label variable label ------------------------------------------------------------------------------- male byte %10.0g Smale Gender warm byte %17.0g Swarm A working mother can establish just as warm and secure a relationship with her c . tabulate male warm, missing | A working mother can establish just as warm | and secure a relationship with her c Gender | Strongly Agree Disagree Strongly | Total -----------+--------------------------------------------+---------- Female | 139 323 461 304 | 1,227 Male | 158 400 395 113 | 1,066 -----------+--------------------------------------------+---------- Total | 297 723 856 417 | 2,293 . . label language analysis . describe male warm storage display value variable name type format label variable label ------------------------------------------------------------------------------- male byte %10.0g Amale Gender: 1=male 0=female warm byte %17.0g Awarm Mom can have warm relations with child? . tabulate male warm, missing Gender: | 1=male | Mom can have warm relations with child? 0=female | 1_SD 2_D 3_A 4_SA | Total -----------+--------------------------------------------+---------- 0_Women | 139 323 461 304 | 1,227 1_Men | 158 400 395 113 | 1,066 -----------+--------------------------------------------+---------- Total | 297 723 856 417 | 2,293 . . // #5 . // adding short and long labels . . use wf-languages-single, clear (Workflow data with single language \ 2008-04-03) . 
label language analysis, new (language analysis now current language) . . log close log: D:\wf\work\wf5-language.log log type: text closed on: 24 Oct 2008, 09:41:08 -------------------------------------------------------------------------------- . exit end of do-file . . * SEE ALSO: wf5-master.do and wf5-sgc.do . . log close master log: D:\wf\work\wf5.log log type: text closed on: 24 Oct 2008, 09:41:08 -------------------------------------------------------------------------------- . exit end of do-file . do wf5-sgc.do . capture log close master . set linesize 80 . log using wf5-sgc, name(master) replace text -------------------------------------------------------------------------------- log: D:\wf\work\wf5-sgc.log log type: text opened on: 24 Oct 2008, 09:41:08 . . // program: wf5-sgc.do \ for stata 9 . // task: run all steps . // project: workflow chapter 5 - sgc renaming and relabeling example . // author: scott long \ 2008-10-24 . . * check current labels . do wf5-sgc1a-list.do . capture log close . log using wf5-sgc1a-list, replace text -------------------------------------------------------------------------------- log: D:\wf\work\wf5-sgc1a-list.log log type: text opened on: 24 Oct 2008, 09:41:08 . . // program: wf5-sgc1a-list.do \ for stata 9 . // task: step 1a: list current names and labels . // project: workflow chapter 5 - sgc renaming and relabeling example . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data and describe with default linesize . . use wf-sgc-source, clear (Workflow data for SGC renaming example \ 2008-04-03) . * in stata 10 and later: datasignature confirm . notes _dta _dta: 1. wf-sgc-source.dta \ wf-sgc-support.do jsl 2008-04-03. . . // #2 . // create macro with the names of all variables in dataset . . unab varlist : _all .
display "`varlist'" id_iu cntry_iu vignum serious opfam opfriend tospi tonpm oppme opforg atdisease > atraised atgenes sdlive sdsocial sdchild sdfriend sdwork sdmarry impown imptre > at stout stfriend stlimits stuncom tcfam tcfriend tcdoc gvjob gvhealth gvhous > gvdisben ctxfdoc ctxfmed ctxfhos cause puboften pubfright pubsymp trust gender > age wrkstat marital edudeg . . // #3 . // list names and labels with a loop . . * long line size to prevent wrapping of long labels . set linesize 120 . . * counter to number each variable . local counter = 1 . . * start the loop through all variables . foreach varname in `varlist' { 2. . * retrieve variable label . local varlabel : variable label `varname' 3. * retrieve name of value label . local vallabel : value label `varname' 4. * print the information . display "`counter'." _col(6) "`varname'" _col(19) /// > "`vallabel'" _col(32) "`varlabel'" 5. local ++counter 6. . } 1. id_iu Respondent Number 2. cntry_iu cntry_iu IU Country Number 3. vignum vignum Vignette 4. serious serious Q1 How serious would you consider Xs situation to be? 5. opfam Ldummy Q2_1 What X should do:Talk to family 6. opfriend Ldummy Q2_2 What X should do:Talk to friends 7. tospi Ldummy Q2_7 What X should do:Go to spiritual or traditional healer 8. tonpm Ldummy Q2_8 What X should do:Take non-prescription medication 9. oppme Ldummy Q2_9 What X should do:Take prescription medication 10. opforg Ldummy Q2_14 What X should do:Try to forget about it 11. atdisease Llikely Q4 Xs stuation is caused by: A brain disease or disorder 12. atraised Llikely Q5 Xs stuation is caused by: the way X was raised 13. atgenes Llikely Q7 Xs stuation is caused by: A genetic or inherited problem 14. sdlive Ldist Q13 To have X as a neighbor? 15. sdsocial Ldist Q14 To spend time socializing with X? 16. sdchild Ldist Q15 To have X care for your children or children you know? 17. sdfriend Ldist Q16 To make friends with X? 18. sdwork Ldist Q17 To work closely with X on a job? 19. 
sdmarry Ldist Q18 To have X marry someone related to you? 20. impown Llikely Q19 How likely is it that Xs situation will improve on its own? 21. imptreat Llikely Q20 How likely is it that Xs situation will improve with treatment? 22. stout Llikert Q23 Getting treatment would make X an outsider in Xs community 23. stfriend Llikert Q24 If X let people know X is in treatment, X would lose some friends 24. stlimits Llikert Q25 No matter how much X achieves, opprtun limit if oth knew X recvd treatment 25. stuncom Llikert Q26 Being around X would make me feel uncomfortable 26. tcfam Limport Q43 How Important: Turn to family for help 27. tcfriend Limport Q44 How Important: Turn to friends for help 28. tcdoc Limport Q46 How Important: Go to a general medical doctor for help 29. gvjob Lrespons Q49 Government Responsibility: Provide a job for X if X wants one 30. gvhealth Lrespons Q50 Government Responsibility: Provide health care for X 31. gvhous Lrespons Q51 Government Responsibility: Provide housing for X if X can not afford it 32. gvdisben Lrespons Q53 Government Responsibility: Provide disability benefits for X 33. ctxfdoc Lrespons Q56 Forced by law to be examined at a clinic or by a doctor? 34. ctxfmed Lrespons Q57 Forced by law to take medication prescribed by a doctor? 35. ctxfhos Lrespons Q58 Forced by law to be hospitalized for treatment? 36. cause cause Q62 Is Xs situation caused by depression, asthma, schizophrenia, stress, other? 37. puboften puboften Q72 How often you see someone w/a serious mental health problem in public place? 38. pubfright pubfright Q73 How frightening are people seen in public who seem to have mental hlth prob? 39. pubsymp pubsymp Q74 How much sympathy for people w/mental hlth problm that you see in public? 40. trust trust Q75 Would you say people can be trusted or need to be careful dealing w/people? 41. gender gender Gender 42. age age Age 43. wrkstat wrkstat Current employment status 44. marital marital Marital status 45. 
edudeg edudeg Education II-highest education level . . // #4 . // send list to a file for editing . . * open a file that will hold the names and labels . capture file close myfile . file open myfile using wf5-sgc1a-list.txt, write replace . . * write header row with ; delimiters . file write myfile "Number;Name;Value label;Variable labels" _newline . . * counter to number each variable . local counter = 1 . . * start the loop through all variables . foreach varname in `varlist' { 2. . * retrieve current labels . local varlabel : variable label `varname' 3. local vallabel : value label `varname' 4. . * write a ; delimited row of data . file write myfile "`counter';`varname';`vallabel';`varlabel'" _newline 5. . *> for a tab delimited file, you can use this: . *> file write myfile "`counter'" _tab "`varname'" /// > *> _tab "`vallabel'" _tab "`varlabel'" _newline . . local ++counter 6. } . . file close myfile . . log close log: D:\wf\work\wf5-sgc1a-list.log log type: text closed on: 24 Oct 2008, 09:41:08 ------------------------------------------------------------------------------------------------------------------------ . exit end of do-file . do wf5-sgc1b-try.do . capture log close . log using wf5-sgc1b-try, replace text ------------------------------------------------------------------------------------------------------------------------ log: D:\wf\work\wf5-sgc1b-try.log log type: text opened on: 24 Oct 2008, 09:41:08 . . // program: wf5-sgc1b-try.do \ for stata 9 . // task: try current names and labels . // project: workflow chapter 5 - sgc renaming and relabeling example . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data . . use wf-sgc-source, clear (Workflow data for SGC renaming example \ 2008-04-03) . * in stata 10 and later: datasignature confirm . notes _dta _dta: 1. wf-sgc-source.dta \ wf-sgc-support.do jsl 2008-04-03. . . 
// #2 . // use codebook to examine names and variable labels . . codebook, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- id_iu 200 200 1772875 1100107 2601091 Respondent Number cntry_iu 200 8 17.495 11 26 IU Country Number vignum 200 12 6.305 1 12 Vignette serious 196 4 1.709184 1 4 Q1 How serious would you c... opfam 199 2 1.693467 1 2 Q2_1 What X should do:Talk... opfriend 198 2 1.833333 1 2 Q2_2 What X should do:Talk... tospi 197 2 1.954315 1 2 Q2_7 What X should do:Go t... tonpm 198 2 1.969697 1 2 Q2_8 What X should do:Take... oppme 193 2 1.896373 1 2 Q2_9 What X should do:Take... opforg 199 2 1.974874 1 2 Q2_14 What X should do:Try... atdisease 192 4 2.739583 1 4 Q4 Xs stuation is caused b... atraised 194 4 2.948454 1 4 Q5 Xs stuation is caused b... atgenes 186 4 2.435484 1 4 Q7 Xs stuation is caused b... sdlive 195 4 1.758974 1 4 Q13 To have X as a neighbor? sdsocial 192 4 2.088542 1 4 Q14 To spend time socializ... sdchild 191 4 2.979058 1 4 Q15 To have X care for you... sdfriend 187 4 2.037433 1 4 Q16 To make friends with X? sdwork 190 4 2.1 1 4 Q17 To work closely with X... sdmarry 192 4 2.640625 1 4 Q18 To have X marry someon... impown 196 4 3.040816 1 4 Q19 How likely is it that ... imptreat 193 4 1.621762 1 4 Q20 How likely is it that ... stout 167 4 3.197605 1 4 Q23 Getting treatment woul... stfriend 154 4 2.928571 1 4 Q24 If X let people know X... stlimits 151 4 2.748344 1 4 Q25 No matter how much X a... stuncom 154 4 2.967532 1 4 Q26 Being around X would m... tcfam 200 9 8.825 1 10 Q43 How Important: Turn to... tcfriend 199 10 8.055276 1 10 Q44 How Important: Turn to... tcdoc 196 10 8.704082 1 10 Q46 How Important: Go to a... gvjob 196 4 1.693878 1 4 Q49 Government Responsibil... gvhealth 199 4 1.336683 1 4 Q50 Government Responsibil... gvhous 190 4 1.973684 1 4 Q51 Government Responsibil... gvdisben 191 4 1.884817 1 4 Q53 Government Responsibil... 
ctxfdoc 197 4 2.06599 1 4 Q56 Forced by law to be ex... ctxfmed 193 4 1.756477 1 4 Q57 Forced by law to take ... ctxfhos 187 4 2.203209 1 4 Q58 Forced by law to be ho... cause 162 5 1.993827 1 5 Q62 Is Xs situation caused... puboften 198 4 2.813131 1 4 Q72 How often you see some... pubfright 135 4 3.02963 1 4 Q73 How frightening are pe... pubsymp 136 4 3.066176 1 4 Q74 How much sympathy for ... trust 145 2 1.813793 1 2 Q75 Would you say people c... gender 200 2 1.55 1 2 Gender age 200 62 44.395 18 97 Age wrkstat 196 9 4.112245 1 10 Current employment status marital 199 6 2.547739 1 6 Marital status edudeg 200 6 2.235 0 5 Education II-highest educa... -------------------------------------------------------------------------------- . . // #3 . // use tabulate to examine variable and value labels . . * drop variables that aren't appropriate for tabulate . drop id_iu cntry_iu age . . * get a list of the remaining variables . unab varlist : _all . . * loop through the variables . foreach varname in `varlist' { 2. display "`varname':" 3. tabulate gender `varname', miss 4. } vignum: | Vignette Gender | Depressiv Depressiv Depressiv Depressiv Schizophr | Total -----------+-------------------------------------------------------+---------- Male | 15 11 3 4 7 | 90 Female | 8 12 9 5 13 | 110 -----------+-------------------------------------------------------+---------- Total | 23 23 12 9 20 | 200 | Vignette Gender | Schizophr Schizophr Schizophr Asthma/Ma Asthma/Ma | Total -----------+-------------------------------------------------------+---------- Male | 5 14 7 7 8 | 90 Female | 5 7 9 7 18 | 110 -----------+-------------------------------------------------------+---------- Total | 10 21 16 14 26 | 200 | Vignette Gender | Asthma/Mi Asthma/Mi | Total -----------+----------------------+---------- Male | 5 4 | 90 Female | 8 9 | 110 -----------+----------------------+---------- Total | 13 13 | 200 serious: | Q1 How serious would you consider Xs situation to be? 
Gender | Very seri Moderatel Not very Not at al .c | Total -----------+-------------------------------------------------------+---------- Male | 42 37 8 2 1 | 90 Female | 49 38 18 2 3 | 110 -----------+-------------------------------------------------------+---------- Total | 91 75 26 4 4 | 200 opfam: | Q2_1 What X should do:Talk to | family Gender | Yes No .c | Total -----------+---------------------------------+---------- Male | 30 59 1 | 90 Female | 31 79 0 | 110 -----------+---------------------------------+---------- Total | 61 138 1 | 200 opfriend: | Q2_2 What X should do:Talk to | friends Gender | Yes No .c | Total -----------+---------------------------------+---------- Male | 16 73 1 | 90 Female | 17 92 1 | 110 -----------+---------------------------------+---------- Total | 33 165 2 | 200 tospi: | Q2_7 What X should do:Go to | spiritual or traditional healer Gender | Yes No .c | Total -----------+---------------------------------+---------- Male | 4 86 0 | 90 Female | 5 102 3 | 110 -----------+---------------------------------+---------- Total | 9 188 3 | 200 tonpm: | Q2_8 What X should do:Take | non-prescription medication Gender | Yes No .c | Total -----------+---------------------------------+---------- Male | 5 85 0 | 90 Female | 1 107 2 | 110 -----------+---------------------------------+---------- Total | 6 192 2 | 200 oppme: | Q2_9 What X should do:Take | prescription medication Gender | Yes No .c | Total -----------+---------------------------------+---------- Male | 9 79 2 | 90 Female | 11 94 5 | 110 -----------+---------------------------------+---------- Total | 20 173 7 | 200 opforg: | Q2_14 What X should do:Try to | forget about it Gender | Yes No .c | Total -----------+---------------------------------+---------- Male | 4 86 0 | 90 Female | 1 108 1 | 110 -----------+---------------------------------+---------- Total | 5 194 1 | 200 atdisease: | Q4 Xs stuation is caused by: A brain disease or | disorder Gender | Very Like Somewhat Not Very 
Not at al .c | Total -----------+-------------------------------------------------------+---------- Male | 10 27 30 22 0 | 90 Female | 12 30 32 29 7 | 110 -----------+-------------------------------------------------------+---------- Total | 22 57 62 51 7 | 200 | Q4 Xs | stuation | is caused | by: A | brain | disease or | disorder Gender | .d | Total -----------+-----------+---------- Male | 1 | 90 Female | 0 | 110 -----------+-----------+---------- Total | 1 | 200 atraised: | Q5 Xs stuation is caused by: the way X was raised Gender | Very Like Somewhat Not Very Not at al .c | Total -----------+-------------------------------------------------------+---------- Male | 5 24 32 29 0 | 90 Female | 8 27 31 38 2 | 110 -----------+-------------------------------------------------------+---------- Total | 13 51 63 67 2 | 200 | Q5 Xs | stuation | is caused | by: the | way X was | raised Gender | .d | Total -----------+-----------+---------- Male | 0 | 90 Female | 4 | 110 -----------+-----------+---------- Total | 4 | 200 atgenes: | Q7 Xs stuation is caused by: A genetic or inherited | problem Gender | Very Like Somewhat Not Very Not at al .c | Total -----------+-------------------------------------------------------+---------- Male | 12 33 23 13 9 | 90 Female | 20 42 22 21 3 | 110 -----------+-------------------------------------------------------+---------- Total | 32 75 45 34 12 | 200 | Q7 Xs | stuation | is caused | by: A | genetic or | inherited | problem Gender | .d | Total -----------+-----------+---------- Male | 0 | 90 Female | 2 | 110 -----------+-----------+---------- Total | 2 | 200 sdlive: | Q13 To have X as a neighbor? Gender | Definitel Probably Probably Definitel .c | Total -----------+-------------------------------------------------------+---------- Male | 39 32 10 4 4 | 90 Female | 45 51 9 5 0 | 110 -----------+-------------------------------------------------------+---------- Total | 84 83 19 9 4 | 200 | Q13 To | have X as | a | neighbor? 
Gender | .d | Total -----------+-----------+---------- Male | 1 | 90 Female | 0 | 110 -----------+-----------+---------- Total | 1 | 200 sdsocial: | Q14 To spend time socializing with X? Gender | Definitel Probably Probably Definitel .c | Total -----------+-------------------------------------------------------+---------- Male | 24 38 17 7 4 | 90 Female | 28 49 20 9 4 | 110 -----------+-------------------------------------------------------+---------- Total | 52 87 37 16 8 | 200 sdchild: | Q15 To have X care for your children or children you | know? Gender | Definitel Probably Probably Definitel .c | Total -----------+-------------------------------------------------------+---------- Male | 11 15 22 35 5 | 90 Female | 7 24 41 36 2 | 110 -----------+-------------------------------------------------------+---------- Total | 18 39 63 71 7 | 200 | Q15 To | have X | care for | your | children | or | children | you know? Gender | .d | Total -----------+-----------+---------- Male | 2 | 90 Female | 0 | 110 -----------+-----------+---------- Total | 2 | 200 sdfriend: | Q16 To make friends with X? Gender | Definitel Probably Probably Definitel .c | Total -----------+-------------------------------------------------------+---------- Male | 19 44 16 6 5 | 90 Female | 38 36 20 8 8 | 110 -----------+-------------------------------------------------------+---------- Total | 57 80 36 14 13 | 200 sdwork: | Q17 To work closely with X on a job? Gender | Definitel Probably Probably Definitel .c | Total -----------+-------------------------------------------------------+---------- Male | 28 27 19 8 7 | 90 Female | 27 52 19 10 2 | 110 -----------+-------------------------------------------------------+---------- Total | 55 79 38 18 9 | 200 | Q17 To | work | closely | with X on | a job? Gender | .d | Total -----------+-----------+---------- Male | 1 | 90 Female | 0 | 110 -----------+-----------+---------- Total | 1 | 200 sdmarry: | Q18 To have X marry someone related to you? 
Gender | Definitel Probably Probably Definitel .c | Total -----------+-------------------------------------------------------+---------- Male | 17 22 28 21 2 | 90 Female | 17 30 27 30 5 | 110 -----------+-------------------------------------------------------+---------- Total | 34 52 55 51 7 | 200 | Q18 To | have X | marry | someone | related to | you? Gender | .d | Total -----------+-----------+---------- Male | 0 | 90 Female | 1 | 110 -----------+-----------+---------- Total | 1 | 200 impown: | Q19 How likely is it that Xs situation will improve on | its own? Gender | Very Like Somewhat Not Very Not at al .c | Total -----------+-------------------------------------------------------+---------- Male | 4 24 31 29 2 | 90 Female | 5 17 48 38 2 | 110 -----------+-------------------------------------------------------+---------- Total | 9 41 79 67 4 | 200 imptreat: | Q20 How likely is it that Xs situation will improve | with treatment? Gender | Very Like Somewhat Not Very Not at al .c | Total -----------+-------------------------------------------------------+---------- Male | 44 36 4 3 3 | 90 Female | 47 52 6 1 4 | 110 -----------+-------------------------------------------------------+---------- Total | 91 88 10 4 7 | 200 stout: | Q23 Getting treatment would make X an outsider in Xs | community Gender | Strongly Agree Disagree Strongly .c | Total -----------+-------------------------------------------------------+---------- Male | 2 8 30 31 1 | 90 Female | 2 17 42 35 3 | 110 -----------+-------------------------------------------------------+---------- Total | 4 25 72 66 4 | 200 | Q23 Getting treatment | would make X an | outsider in Xs | community Gender | .d .g | Total -----------+----------------------+---------- Male | 0 18 | 90 Female | 1 10 | 110 -----------+----------------------+---------- Total | 1 28 | 200 stfriend: | Q24 If X let people know X is in treatment, X would | lose some friends Gender | Strongly Agree Disagree Strongly .c | Total 
-----------+-------------------------------------------------------+---------- Male | 4 19 28 20 2 | 90 Female | 0 27 33 23 3 | 110 -----------+-------------------------------------------------------+---------- Total | 4 46 61 43 5 | 200 | Q24 If X let people | know X is in | treatment, X would | lose some friends Gender | .d .g | Total -----------+----------------------+---------- Male | 1 16 | 90 Female | 0 24 | 110 -----------+----------------------+---------- Total | 1 40 | 200 stlimits: | Q25 No matter how much X achieves, opprtun limit if | oth knew X recvd treatment Gender | Strongly Agree Disagree Strongly .c | Total -----------+-------------------------------------------------------+---------- Male | 4 24 28 14 4 | 90 Female | 6 28 27 20 5 | 110 -----------+-------------------------------------------------------+---------- Total | 10 52 55 34 9 | 200 | Q25 No matter how | much X achieves, | opprtun limit if oth | knew X recvd | treatment Gender | .d .g | Total -----------+----------------------+---------- Male | 0 16 | 90 Female | 1 23 | 110 -----------+----------------------+---------- Total | 1 39 | 200 stuncom: | Q26 Being around X would make me feel uncomfortable Gender | Strongly Agree Disagree Strongly .c | Total -----------+-------------------------------------------------------+---------- Male | 5 13 31 19 2 | 90 Female | 3 17 44 22 3 | 110 -----------+-------------------------------------------------------+---------- Total | 8 30 75 41 5 | 200 | Q26 Being | around X | would make | me feel | uncomforta | ble Gender | .g | Total -----------+-----------+---------- Male | 20 | 90 Female | 21 | 110 -----------+-----------+---------- Total | 41 | 200 tcfam: | Q43 How Important: Turn to family for help Gender | Not at al 3 4 5 6 | Total -----------+-------------------------------------------------------+---------- Male | 0 1 1 3 3 | 90 Female | 3 2 0 8 8 | 110 -----------+-------------------------------------------------------+---------- Total | 3 3 1 11 
11 | 200 | Q43 How Important: Turn to family for help Gender | 7 8 9 Very Impo | Total -----------+--------------------------------------------+---------- Male | 1 15 8 58 | 90 Female | 1 14 10 64 | 110 -----------+--------------------------------------------+---------- Total | 2 29 18 122 | 200 tcfriend: | Q44 How Important: Turn to friends for help Gender | Not at al 2 3 4 5 | Total -----------+-------------------------------------------------------+---------- Male | 2 1 0 3 11 | 90 Female | 0 0 3 3 8 | 110 -----------+-------------------------------------------------------+---------- Total | 2 1 3 6 19 | 200 | Q44 How Important: Turn to friends for help Gender | 6 7 8 9 Very Impo | Total -----------+-------------------------------------------------------+---------- Male | 5 12 11 11 33 | 90 Female | 5 16 20 12 43 | 110 -----------+-------------------------------------------------------+---------- Total | 10 28 31 23 76 | 200 | Q44 How | Important: | Turn to | friends | for help Gender | .d | Total -----------+-----------+---------- Male | 1 | 90 Female | 0 | 110 -----------+-----------+---------- Total | 1 | 200 tcdoc: | Q46 How Important: Go to a general medical doctor for | help Gender | Not at al 2 3 4 5 | Total -----------+-------------------------------------------------------+---------- Male | 2 1 0 2 4 | 90 Female | 2 1 1 0 10 | 110 -----------+-------------------------------------------------------+---------- Total | 4 2 1 2 14 | 200 | Q46 How Important: Go to a general medical doctor for | help Gender | 6 7 8 9 Very Impo | Total -----------+-------------------------------------------------------+---------- Male | 4 4 10 9 52 | 90 Female | 3 7 6 11 67 | 110 -----------+-------------------------------------------------------+---------- Total | 7 11 16 20 119 | 200 | Q46 How Important: Go | to a general medical | doctor for help Gender | .c .d | Total -----------+----------------------+---------- Male | 2 0 | 90 Female | 1 1 | 110 
-----------+----------------------+---------- Total | 3 1 | 200 gvjob: | Q49 Government Responsibility: Provide a job for X if | X wants one Gender | Definitel Probably Probably Definitel .c | Total -----------+-------------------------------------------------------+---------- Male | 43 35 7 3 2 | 90 Female | 51 39 15 3 2 | 110 -----------+-------------------------------------------------------+---------- Total | 94 74 22 6 4 | 200 gvhealth: | Q50 Government Responsibility: Provide health care for | X Gender | Definitel Probably Probably Definitel .c | Total -----------+-------------------------------------------------------+---------- Male | 68 17 4 0 1 | 90 Female | 74 31 4 1 0 | 110 -----------+-------------------------------------------------------+---------- Total | 142 48 8 1 1 | 200 gvhous: | Q51 Government Responsibility: Provide housing for X | if X can not afford it Gender | Definitel Probably Probably Definitel .c | Total -----------+-------------------------------------------------------+---------- Male | 33 33 15 5 4 | 90 Female | 31 47 18 8 6 | 110 -----------+-------------------------------------------------------+---------- Total | 64 80 33 13 10 | 200 gvdisben: | Q53 Government Responsibility: Provide disability | benefits for X Gender | Definitel Probably Probably Definitel .c | Total -----------+-------------------------------------------------------+---------- Male | 40 27 9 9 5 | 90 Female | 40 44 13 9 3 | 110 -----------+-------------------------------------------------------+---------- Total | 80 71 22 18 8 | 200 | Q53 | Government | Responsibi | lity: | Provide | disability | benefits | for X Gender | .d | Total -----------+-----------+---------- Male | 0 | 90 Female | 1 | 110 -----------+-----------+---------- Total | 1 | 200 ctxfdoc: | Q56 Forced by law to be examined at a clinic or by a | doctor? 
Gender | Definitel Probably Probably Definitel .c | Total -----------+-------------------------------------------------------+---------- Male | 35 29 12 13 1 | 90 Female | 40 36 14 18 0 | 110 -----------+-------------------------------------------------------+---------- Total | 75 65 26 31 1 | 200 | Q56 Forced | by law to | be | examined | at a | clinic or | by a | doctor? Gender | .d | Total -----------+-----------+---------- Male | 0 | 90 Female | 2 | 110 -----------+-----------+---------- Total | 2 | 200 ctxfmed: | Q57 Forced by law to take medication prescribed by a | doctor? Gender | Definitel Probably Probably Definitel .c | Total -----------+-------------------------------------------------------+---------- Male | 45 23 7 12 3 | 90 Female | 59 28 12 7 3 | 110 -----------+-------------------------------------------------------+---------- Total | 104 51 19 19 6 | 200 | Q57 Forced | by law to | take | medication | prescribed | by a | doctor? Gender | .d | Total -----------+-----------+---------- Male | 0 | 90 Female | 1 | 110 -----------+-----------+---------- Total | 1 | 200 ctxfhos: | Q58 Forced by law to be hospitalized for treatment? Gender | Definitel Probably Probably Definitel .c | Total -----------+-------------------------------------------------------+---------- Male | 27 23 16 17 5 | 90 Female | 32 38 21 13 6 | 110 -----------+-------------------------------------------------------+---------- Total | 59 61 37 30 11 | 200 | Q58 Forced | by law to | be | hospitaliz | ed for | treatment? Gender | .d | Total -----------+-----------+---------- Male | 2 | 90 Female | 0 | 110 -----------+-----------+---------- Total | 2 | 200 cause: | Q62 Is Xs situation caused by depression, asthma, | schizophrenia, stress, other? 
Gender | Depressio Asthma Schizophr Stress Something | Total -----------+-------------------------------------------------------+---------- Male | 40 9 8 13 2 | 90 Female | 44 21 9 14 2 | 110 -----------+-------------------------------------------------------+---------- Total | 84 30 17 27 4 | 200 | Q62 Is Xs situation | caused by depression, | asthma, | schizophrenia, | stress, other? Gender | .c .h | Total -----------+----------------------+---------- Male | 5 13 | 90 Female | 6 14 | 110 -----------+----------------------+---------- Total | 11 27 | 200 puboften: | Q72 How often you see someone w/a serious mental | health problem in public place? Gender | Frequentl Occasiona Rarely Never .c | Total -----------+-------------------------------------------------------+---------- Male | 16 20 21 32 1 | 90 Female | 12 29 32 36 1 | 110 -----------+-------------------------------------------------------+---------- Total | 28 49 53 68 2 | 200 pubfright: | Q73 How frightening are people seen in public who seem | to have mental hlth prob? Gender | Very Frig Somewhat Not Very Not at al .a | Total -----------+-------------------------------------------------------+---------- Male | 1 22 13 23 29 | 90 Female | 4 18 23 31 32 | 110 -----------+-------------------------------------------------------+---------- Total | 5 40 36 54 61 | 200 | Q73 How frightening | are people seen in | public who seem to | have mental hlth | prob? Gender | .c .d | Total -----------+----------------------+---------- Male | 1 1 | 90 Female | 2 0 | 110 -----------+----------------------+---------- Total | 3 1 | 200 pubsymp: | Q74 How much sympathy for people w/mental hlth problm | that you see in public? 
Gender | No Sympat A Little Quite a b A Great d .a | Total -----------+-------------------------------------------------------+---------- Male | 1 13 24 19 32 | 90 Female | 6 8 40 25 29 | 110 -----------+-------------------------------------------------------+---------- Total | 7 21 64 44 61 | 200 | Q74 How | much | sympathy | for people | w/mental | hlth | problm | that you | see in | public? Gender | .c | Total -----------+-----------+---------- Male | 1 | 90 Female | 2 | 110 -----------+-----------+---------- Total | 3 | 200 trust: | Q75 Would you say people can be trusted or need to be | careful dealing w/people? Gender | Most peop Need to b .a .c .d | Total -----------+-------------------------------------------------------+---------- Male | 14 47 29 0 0 | 90 Female | 13 71 24 1 1 | 110 -----------+-------------------------------------------------------+---------- Total | 27 118 53 1 1 | 200 gender: | Gender Gender | Male Female | Total -----------+----------------------+---------- Male | 90 0 | 90 Female | 0 110 | 110 -----------+----------------------+---------- Total | 90 110 | 200 wrkstat: | Current employment status Gender | Employed- Employed- Employed- Helping f Unemploye | Total -----------+-------------------------------------------------------+---------- Male | 40 8 3 3 5 | 90 Female | 41 5 0 0 10 | 110 -----------+-------------------------------------------------------+---------- Total | 81 13 3 3 15 | 200 | Current employment status Gender | Student,s Retired Housewife Other, no .d | Total -----------+-------------------------------------------------------+---------- Male | 6 8 15 1 1 | 90 Female | 4 21 25 1 3 | 110 -----------+-------------------------------------------------------+---------- Total | 10 29 40 2 4 | 200 marital: | Marital status Gender | Married Widowed Divorced Separated Living as | Total -----------+-------------------------------------------------------+---------- Male | 54 8 3 4 7 | 90 Female | 58 8 7 2 14 | 110 
-----------+-------------------------------------------------------+----------
     Total |       112         16         10          6         21 |       200

           |    Marital status
    Gender | Single, n         .d |     Total
-----------+----------------------+----------
      Male |        14          0 |        90
    Female |        20          1 |       110
-----------+----------------------+----------
     Total |        34          1 |       200

edudeg:

           |          Education II-highest education level
    Gender | No formal  Lowest fo  Above low  Higher se  Above hig |     Total
-----------+-------------------------------------------------------+----------
      Male |        14         25         15         17         12 |        90
    Female |        15         25         16         27         15 |       110
-----------+-------------------------------------------------------+----------
     Total |        29         50         31         44         27 |       200

           |  Education
           | II-highest
           |  education
           |      level
    Gender |  Universit |     Total
-----------+-----------+----------
      Male |         7 |        90
    Female |        12 |       110
-----------+-----------+----------
     Total |        19 |       200
.
. log close
       log:  D:\wf\work\wf5-sgc1b-try.log
  log type:  text
 closed on:  24 Oct 2008, 09:41:08
--------------------------------------------------------------------------------
. exit
end of do-file
.
. * clone source variables
. do wf5-sgc2a-clone.do
. capture log close
. log using wf5-sgc2a-clone, replace text
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf5-sgc2a-clone.log
  log type:  text
 opened on:  24 Oct 2008, 09:41:08
.
. // program: wf5-sgc2a-clone.do \ for stata 9
. // task: make clones of existing variables
. // project: workflow chapter 5 - sgc renaming and relabeling example
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // create locals
.
. local date "2008-10-24"
. local tag "wf5-sgc2a.do jsl `date'"
.
. // #2
. // load data
.
. use wf-sgc-source, clear
(Workflow data for SGC renaming example \ 2008-04-03)
. * in stata 10 and later: datasignature confirm
. notes _dta
_dta:
  1.  wf-sgc-source.dta \ wf-sgc-support.do jsl 2008-04-03.
.
. // #3
. // loop through variables and create clones
.
. unab varlist : _all
. foreach varname in `varlist' {
  2.     clonevar S`varname' = `varname'
  3.     note S`varname': Source variable for `varname' \ `tag'
  4.     note `varname': Clone of source variable S`varname' \ `tag'
  5. }
(4 missing values generated)
(1 missing value generated)
(2 missing values generated)
(3 missing values generated)
(2 missing values generated)
(7 missing values generated)
(1 missing value generated)
(8 missing values generated)
(6 missing values generated)
(14 missing values generated)
(5 missing values generated)
(8 missing values generated)
(9 missing values generated)
(13 missing values generated)
(10 missing values generated)
(8 missing values generated)
(4 missing values generated)
(7 missing values generated)
(33 missing values generated)
(46 missing values generated)
(49 missing values generated)
(46 missing values generated)
(1 missing value generated)
(4 missing values generated)
(4 missing values generated)
(1 missing value generated)
(10 missing values generated)
(9 missing values generated)
(3 missing values generated)
(7 missing values generated)
(13 missing values generated)
(38 missing values generated)
(2 missing values generated)
(65 missing values generated)
(64 missing values generated)
(55 missing values generated)
(4 missing values generated)
(1 missing value generated)
.
. // #4
. // check the variables
.
. codebook, compact

Variable    Obs Unique      Mean      Min      Max  Label
--------------------------------------------------------------------------------
id_iu       200    200   1772875  1100107  2601091  Respondent Number
cntry_iu    200      8    17.495       11       26  IU Country Number
vignum      200     12     6.305        1       12  Vignette
serious     196      4  1.709184        1        4  Q1 How serious would you ...
opfam       199      2  1.693467        1        2  Q2_1 What X should do:Tal...
opfriend    198      2  1.833333        1        2  Q2_2 What X should do:Tal...
tospi       197      2  1.954315        1        2  Q2_7 What X should do:Go ...
tonpm       198      2  1.969697        1        2  Q2_8 What X should do:Tak...
oppme       193      2  1.896373        1        2  Q2_9 What X should do:Tak...
opforg      199      2  1.974874        1        2  Q2_14 What X should do:Tr...
atdisease   192      4  2.739583        1        4  Q4 Xs stuation is caused ...
atraised    194      4  2.948454        1        4  Q5 Xs stuation is caused ...
atgenes     186      4  2.435484        1        4  Q7 Xs stuation is caused ...
sdlive      195      4  1.758974        1        4  Q13 To have X as a neighbor?
sdsocial    192      4  2.088542        1        4  Q14 To spend time sociali...
sdchild     191      4  2.979058        1        4  Q15 To have X care for yo...
sdfriend    187      4  2.037433        1        4  Q16 To make friends with X?
sdwork      190      4       2.1        1        4  Q17 To work closely with ...
sdmarry     192      4  2.640625        1        4  Q18 To have X marry someo...
impown      196      4  3.040816        1        4  Q19 How likely is it that...
imptreat    193      4  1.621762        1        4  Q20 How likely is it that...
stout       167      4  3.197605        1        4  Q23 Getting treatment wou...
stfriend    154      4  2.928571        1        4  Q24 If X let people know ...
stlimits    151      4  2.748344        1        4  Q25 No matter how much X ...
stuncom     154      4  2.967532        1        4  Q26 Being around X would ...
tcfam       200      9     8.825        1       10  Q43 How Important: Turn t...
tcfriend    199     10  8.055276        1       10  Q44 How Important: Turn t...
tcdoc       196     10  8.704082        1       10  Q46 How Important: Go to ...
gvjob       196      4  1.693878        1        4  Q49 Government Responsibi...
gvhealth    199      4  1.336683        1        4  Q50 Government Responsibi...
gvhous      190      4  1.973684        1        4  Q51 Government Responsibi...
gvdisben    191      4  1.884817        1        4  Q53 Government Responsibi...
ctxfdoc     197      4   2.06599        1        4  Q56 Forced by law to be e...
ctxfmed     193      4  1.756477        1        4  Q57 Forced by law to take...
ctxfhos     187      4  2.203209        1        4  Q58 Forced by law to be h...
cause       162      5  1.993827        1        5  Q62 Is Xs situation cause...
puboften    198      4  2.813131        1        4  Q72 How often you see som...
pubfright   135      4   3.02963        1        4  Q73 How frightening are p...
pubsymp     136      4  3.066176        1        4  Q74 How much sympathy for...
trust       145      2  1.813793        1        2  Q75 Would you say people ...
gender      200      2      1.55        1        2  Gender
age         200     62    44.395       18       97  Age
wrkstat     196      9  4.112245        1       10  Current employment status
marital     199      6  2.547739        1        6  Marital status
edudeg      200      6     2.235        0        5  Education II-highest educ...
Sid_iu      200    200   1772875  1100107  2601091  Respondent Number
Scntry_iu   200      8    17.495       11       26  IU Country Number
Svignum     200     12     6.305        1       12  Vignette
Sserious    196      4  1.709184        1        4  Q1 How serious would you ...
Sopfam      199      2  1.693467        1        2  Q2_1 What X should do:Tal...
Sopfriend   198      2  1.833333        1        2  Q2_2 What X should do:Tal...
Stospi      197      2  1.954315        1        2  Q2_7 What X should do:Go ...
Stonpm      198      2  1.969697        1        2  Q2_8 What X should do:Tak...
Soppme      193      2  1.896373        1        2  Q2_9 What X should do:Tak...
Sopforg     199      2  1.974874        1        2  Q2_14 What X should do:Tr...
Satdisease  192      4  2.739583        1        4  Q4 Xs stuation is caused ...
Satraised   194      4  2.948454        1        4  Q5 Xs stuation is caused ...
Satgenes    186      4  2.435484        1        4  Q7 Xs stuation is caused ...
Ssdlive     195      4  1.758974        1        4  Q13 To have X as a neighbor?
Ssdsocial   192      4  2.088542        1        4  Q14 To spend time sociali...
Ssdchild    191      4  2.979058        1        4  Q15 To have X care for yo...
Ssdfriend   187      4  2.037433        1        4  Q16 To make friends with X?
Ssdwork     190      4       2.1        1        4  Q17 To work closely with ...
Ssdmarry    192      4  2.640625        1        4  Q18 To have X marry someo...
Simpown     196      4  3.040816        1        4  Q19 How likely is it that...
Simptreat   193      4  1.621762        1        4  Q20 How likely is it that...
Sstout      167      4  3.197605        1        4  Q23 Getting treatment wou...
Sstfriend   154      4  2.928571        1        4  Q24 If X let people know ...
Sstlimits   151      4  2.748344        1        4  Q25 No matter how much X ...
Sstuncom    154      4  2.967532        1        4  Q26 Being around X would ...
Stcfam      200      9     8.825        1       10  Q43 How Important: Turn t...
Stcfriend   199     10  8.055276        1       10  Q44 How Important: Turn t...
Stcdoc      196     10  8.704082        1       10  Q46 How Important: Go to ...
Sgvjob      196      4  1.693878        1        4  Q49 Government Responsibi...
Sgvhealth   199      4  1.336683        1        4  Q50 Government Responsibi...
Sgvhous     190      4  1.973684        1        4  Q51 Government Responsibi...
Sgvdisben   191      4  1.884817        1        4  Q53 Government Responsibi...
Sctxfdoc    197      4   2.06599        1        4  Q56 Forced by law to be e...
Sctxfmed    193      4  1.756477        1        4  Q57 Forced by law to take...
Sctxfhos    187      4  2.203209        1        4  Q58 Forced by law to be h...
Scause      162      5  1.993827        1        5  Q62 Is Xs situation cause...
Spuboften   198      4  2.813131        1        4  Q72 How often you see som...
Spubfright  135      4   3.02963        1        4  Q73 How frightening are p...
Spubsymp    136      4  3.066176        1        4  Q74 How much sympathy for...
Strust      145      2  1.813793        1        2  Q75 Would you say people ...
Sgender     200      2      1.55        1        2  Gender
Sage        200     62    44.395       18       97  Age
Swrkstat    196      9  4.112245        1       10  Current employment status
Smarital    199      6  2.547739        1        6  Marital status
Sedudeg     200      6     2.235        0        5  Education II-highest educ...
--------------------------------------------------------------------------------
.
. // #5
. // closeup and save data
.
. quietly compress
. note: wf-sgc01.dta \ create clones of source variables \ `tag'
. label data "Workflow data for SGC renaming example \ `date'"
. * in stata 10 and later: datasignature set, reset
. save wf-sgc01, replace
file wf-sgc01.dta saved
.
. * check the dataset
. use wf-sgc01, clear
(Workflow data for SGC renaming example \ 2008-10-24)
. * in stata 10 and later: datasignature confirm
. note _dta
_dta:
  1.  wf-sgc-source.dta \ wf-sgc-support.do jsl 2008-04-03.
  2.  wf-sgc01.dta \ create clones of source variables \ wf5-sgc2a.do jsl 2008-10-24
.
. log close
       log:  D:\wf\work\wf5-sgc2a-clone.log
  log type:  text
 closed on:  24 Oct 2008, 09:41:08
--------------------------------------------------------------------------------
. exit
end of do-file
.
. * rename variables
. do wf5-sgc2b-rename-dump.do
. capture log close
. log using wf5-sgc2b-rename-dump, replace text
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf5-sgc2b-rename-dump.log
  log type:  text
 opened on:  24 Oct 2008, 09:41:08
.
. // program: wf5-sgc2b-rename-dump.do \ for stata 9
. // task: create dummy rename commands
. // project: workflow chapter 5 - sgc renaming and relabeling example
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data
.
. use wf-sgc01, clear
(Workflow data for SGC renaming example \ 2008-10-24)
. * in stata 10 and later: datasignature confirm
. notes _dta
_dta:
  1.  wf-sgc-source.dta \ wf-sgc-support.do jsl 2008-04-03.
  2.  wf-sgc01.dta \ create clones of source variables \ wf5-sgc2a.do jsl 2008-10-24
.
. // #2
. // drop the source variables that will not be renamed & sort names
.
. * drop S(ource) variables since they will not be renamed
. drop S*
.
. * create an alphabetized list of the non-S variables
. aorder
.
. // #3
. // loop through the names and create baseline rename commands
.
. unab varlist : _all
. file open myfile using wf5-sgc2b-rename-dummy.doi, write replace
.
. foreach varname in `varlist' {
  2.     file write myfile "*rename `varname'" _col(22) "`varname'" _newline
  3. }
. file close myfile
.
. log close
       log:  D:\wf\work\wf5-sgc2b-rename-dump.log
  log type:  text
 closed on:  24 Oct 2008, 09:41:08
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf5-sgc2c-rename.do
. capture log close
. log using wf5-sgc2c-rename, replace text
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf5-sgc2c-rename.log
  log type:  text
 opened on:  24 Oct 2008, 09:41:08
.
. // program: wf5-sgc2c-rename.do \ for stata 9
. // include: wf5-sgc2b-rename-revised.doi
. // task: rename variables using commands generated in step3a.
. // project: workflow chapter 5 - sgc renaming and relabeling example
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // define locals
.
. local date "2008-10-24"
. local tag "wf5-sgc2c.do jsl `date'."
.
. // #2
. // load the data
.
. use wf-sgc01, clear
(Workflow data for SGC renaming example \ 2008-10-24)
. * in stata 10 and later: datasignature confirm
. notes _dta
_dta:
  1.  wf-sgc-source.dta \ wf-sgc-support.do jsl 2008-04-03.
  2.  wf-sgc01.dta \ create clones of source variables \ wf5-sgc2a.do jsl 2008-10-24
.
. // #3
. // include the edited rename commands
.
. include wf5-sgc2b-rename-revised.doi
. // include: wf5-sgc2b-rename-revised.doi
. // used by: wf5-sgc2c-rename.do \ for stata 9
. // task: rename variables for SGC
. // project: workflow chapter
. // author: scott long \ 2008-10-24
.
. *rename age age
. *rename atdisease atdisease
. rename atgenes atgenet
. *rename atraised atraised
. *rename cause cause
. *rename cntry_iu cntry_iu
. rename ctxfdoc clawdoc
. rename ctxfhos clawhosp
. rename ctxfmed clawpmed
. *rename edudeg edudeg
. *rename gender gender
. rename gvdisben gvdisab
. *rename gvhealth gvhealth
. rename gvhous gvhouse
. *rename gvjob gvjob
. *rename id_iu id_iu
. *rename impown impown
. *rename imptreat imptreat
. *rename marital marital
. *rename opfam opfam
. rename opforg opforget
. *rename opfriend opfriend
. rename oppme oppremed
. rename pubfright pubfrght
. *rename puboften puboften
. *rename pubsymp pubsymp
. *rename sdchild sdchild
. *rename sdfriend sdfriend
. rename sdlive sdneighb
. *rename sdmarry sdmarry
. *rename sdsocial sdsocial
. *rename sdwork sdwork
. *rename serious serious
. *rename stfriend stfriend
. *rename stlimits stlimits
. *rename stout stout
. rename stuncom stuncmft
. *rename tcdoc tcdoc
. *rename tcfam tcfam
. *rename tcfriend tcfriend
. rename tonpm opnomed
. rename tospi opspirit
. *rename trust trust
. *rename vignum vignum
. *rename wrkstat wrkstat
.
.
. // #4
. // closeup and save data
.
. quietly compress
. note: wf-sgc02.dta \ rename source variables \ `tag'
. label data "Workflow data for SGC renaming example \ `date'"
. * in stata 10 and later: datasignature set, reset
. save wf-sgc02, replace
file wf-sgc02.dta saved
.
. * check data
. use wf-sgc02, clear
(Workflow data for SGC renaming example \ 2008-10-24)
. * in stata 10 and later: datasignature confirm
. notes _dta
_dta:
  1.  wf-sgc-source.dta \ wf-sgc-support.do jsl 2008-04-03.
  2.  wf-sgc01.dta \ create clones of source variables \ wf5-sgc2a.do jsl 2008-10-24
  3.  wf-sgc02.dta \ rename source variables \ wf5-sgc2c.do jsl 2008-10-24.
.
. // #5
. // check new names
.
. set linesize 120
. nmlab
id_iu       Respondent Number
cntry_iu    IU Country Number
vignum      Vignette
serious     Q1 How serious would you consider Xs situation to be?
opfam       Q2_1 What X should do:Talk to family
opfriend    Q2_2 What X should do:Talk to friends
opspirit    Q2_7 What X should do:Go to spiritual or traditional healer
opnomed     Q2_8 What X should do:Take non-prescription medication
oppremed    Q2_9 What X should do:Take prescription medication
opforget    Q2_14 What X should do:Try to forget about it
atdisease   Q4 Xs stuation is caused by: A brain disease or disorder
atraised    Q5 Xs stuation is caused by: the way X was raised
atgenet     Q7 Xs stuation is caused by: A genetic or inherited problem
sdneighb    Q13 To have X as a neighbor?
sdsocial    Q14 To spend time socializing with X?
sdchild     Q15 To have X care for your children or children you know?
sdfriend    Q16 To make friends with X?
sdwork      Q17 To work closely with X on a job?
sdmarry     Q18 To have X marry someone related to you?
impown      Q19 How likely is it that Xs situation will improve on its own?
imptreat    Q20 How likely is it that Xs situation will improve with treatment?
stout       Q23 Getting treatment would make X an outsider in Xs community
stfriend    Q24 If X let people know X is in treatment, X would lose some friends
stlimits    Q25 No matter how much X achieves, opprtun limit if oth knew X recvd treatment
stuncmft    Q26 Being around X would make me feel uncomfortable
tcfam       Q43 How Important: Turn to family for help
tcfriend    Q44 How Important: Turn to friends for help
tcdoc       Q46 How Important: Go to a general medical doctor for help
gvjob       Q49 Government Responsibility: Provide a job for X if X wants one
gvhealth    Q50 Government Responsibility: Provide health care for X
gvhouse     Q51 Government Responsibility: Provide housing for X if X can not afford it
gvdisab     Q53 Government Responsibility: Provide disability benefits for X
clawdoc     Q56 Forced by law to be examined at a clinic or by a doctor?
clawpmed    Q57 Forced by law to take medication prescribed by a doctor?
clawhosp    Q58 Forced by law to be hospitalized for treatment?
cause       Q62 Is Xs situation caused by depression, asthma, schizophrenia, stress, other?
puboften    Q72 How often you see someone w/a serious mental health problem in public place?
pubfrght    Q73 How frightening are people seen in public who seem to have mental hlth prob?
pubsymp     Q74 How much sympathy for people w/mental hlth problm that you see in public?
trust       Q75 Would you say people can be trusted or need to be careful dealing w/people?
gender      Gender
age         Age
wrkstat     Current employment status
marital     Marital status
edudeg      Education II-highest education level
Sid_iu      Respondent Number
Scntry_iu   IU Country Number
Svignum     Vignette
Sserious    Q1 How serious would you consider Xs situation to be?
Sopfam      Q2_1 What X should do:Talk to family
Sopfriend   Q2_2 What X should do:Talk to friends
Stospi      Q2_7 What X should do:Go to spiritual or traditional healer
Stonpm      Q2_8 What X should do:Take non-prescription medication
Soppme      Q2_9 What X should do:Take prescription medication
Sopforg     Q2_14 What X should do:Try to forget about it
Satdisease  Q4 Xs stuation is caused by: A brain disease or disorder
Satraised   Q5 Xs stuation is caused by: the way X was raised
Satgenes    Q7 Xs stuation is caused by: A genetic or inherited problem
Ssdlive     Q13 To have X as a neighbor?
Ssdsocial   Q14 To spend time socializing with X?
Ssdchild    Q15 To have X care for your children or children you know?
Ssdfriend   Q16 To make friends with X?
Ssdwork     Q17 To work closely with X on a job?
Ssdmarry    Q18 To have X marry someone related to you?
Simpown     Q19 How likely is it that Xs situation will improve on its own?
Simptreat   Q20 How likely is it that Xs situation will improve with treatment?
Sstout      Q23 Getting treatment would make X an outsider in Xs community
Sstfriend   Q24 If X let people know X is in treatment, X would lose some friends
Sstlimits   Q25 No matter how much X achieves, opprtun limit if oth knew X recvd treatment
Sstuncom    Q26 Being around X would make me feel uncomfortable
Stcfam      Q43 How Important: Turn to family for help
Stcfriend   Q44 How Important: Turn to friends for help
Stcdoc      Q46 How Important: Go to a general medical doctor for help
Sgvjob      Q49 Government Responsibility: Provide a job for X if X wants one
Sgvhealth   Q50 Government Responsibility: Provide health care for X
Sgvhous     Q51 Government Responsibility: Provide housing for X if X can not afford it
Sgvdisben   Q53 Government Responsibility: Provide disability benefits for X
Sctxfdoc    Q56 Forced by law to be examined at a clinic or by a doctor?
Sctxfmed    Q57 Forced by law to take medication prescribed by a doctor?
Sctxfhos    Q58 Forced by law to be hospitalized for treatment?
Scause      Q62 Is Xs situation caused by depression, asthma, schizophrenia, stress, other?
Spuboften   Q72 How often you see someone w/a serious mental health problem in public place?
Spubfright  Q73 How frightening are people seen in public who seem to have mental hlth prob?
Spubsymp    Q74 How much sympathy for people w/mental hlth problm that you see in public?
Strust      Q75 Would you say people can be trusted or need to be careful dealing w/people?
Sgender     Gender
Sage        Age
Swrkstat    Current employment status
Smarital    Marital status
Sedudeg     Education II-highest education level
.
. log close
       log:  D:\wf\work\wf5-sgc2c-rename.log
  log type:  text
 closed on:  24 Oct 2008, 09:41:08
------------------------------------------------------------------------------------------------------------------------
. exit
end of do-file
.
. * change variable labels
. do wf5-sgc3a-varlab-dump.do
. capture log close
. log using wf5-sgc3a-varlab-dump, replace text
------------------------------------------------------------------------------------------------------------------------
       log:  D:\wf\work\wf5-sgc3a-varlab-dump.log
  log type:  text
 opened on:  24 Oct 2008, 09:41:08
.
. // program: wf5-sgc3a-varlab-dump.do \ for stata 9
. // task: step 3a: create dummy commands for variable labels
. // project: workflow chapter 5 - sgc renaming and relabeling example
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data
.
. use wf-sgc02, clear
(Workflow data for SGC renaming example \ 2008-10-24)
. * in stata 10 and later: datasignature confirm
.
. // #2
. // drop S variables since they will not be relabeled
.
. drop S*
.
. // #3
. // create list of all variables and dump var label commands
.
. * get a sorted list of names
. aorder
. unab varlist : _all
.
. file open myfile using wf5-sgc3a-varlab-dummy.doi, write replace
.
. foreach varname in `varlist' {
  2.     local varlabel : variable label `varname'
  3.     file write myfile "label var `varname' " ///
>         _col(24) `""`varlabel'""' _newline
  4. }
.
. file close myfile
.
. log close
       log:  D:\wf\work\wf5-sgc3a-varlab-dump.log
  log type:  text
 closed on:  24 Oct 2008, 09:41:08
--------------------------------------------------------------------------------
. exit
end of do-file
. do wf5-sgc3b-varlab-revise.do
. capture log close
. log using wf5-sgc3b-varlab-revise, replace text
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf5-sgc3b-varlab-revise.log
  log type:  text
 opened on:  24 Oct 2008, 09:41:08
.
. // program: wf5-sgc3b-varlab-revise.do \ for stata 9
. // include: requires wf5-sgc3a-varlab-revised.doi
. // task: create new variable labels
. // project: workflow chapter 5 - sgc renaming and relabeling example
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // define locals
.
. local date "2008-10-24"
. local tag "wf5-sgc3b.do jsl `date'."
.
. // #2
. // load data
.
. use wf-sgc02, clear
(Workflow data for SGC renaming example \ 2008-10-24)
. * in stata 10 and later: datasignature confirm
. notes _dta
_dta:
  1.  wf-sgc-source.dta \ wf-sgc-support.do jsl 2008-04-03.
  2.  wf-sgc01.dta \ create clones of source variables \ wf5-sgc2a.do jsl 2008-10-24
  3.  wf-sgc02.dta \ rename source variables \ wf5-sgc2c.do jsl 2008-10-24.
.
. // #3
. // create a new language for revised labels
.
. label language original, new copy // copy of default language
(language original now current language)
. label language default
. note: language original uses the original, unrevised labels; language ///
>         default uses revised labels \ `tag'
.
. // #4
. // include the edited file with variable labels
.
. include wf5-sgc3a-varlab-revised.doi
. // include: wf5-sgc3a-varlab-revised.doi
. // used by: wf5-sgc3b-varlab-revise.do \ for stata 9
. // task: revised variable labels for SGC
. // project: workflow chapter 5
. // author: scott long \ 2008-10-24
.
. label var age "Age in years"
. label var atdisease "Q04 Cause is brain disorder"
. label var atgenet "Q07 Cause is genetic"
. label var atraised "Q05 Cause is way X was raised"
. label var cause "Q62 Xs situation cased by what?"
. label var clawdoc "Q56 Coerce X to doctor by law"
. label var clawhosp "Q58 Coerce X to hospital by law"
. label var clawpmed "Q57 Coerce X use prescrip med by law"
. label var cntry_iu "IU country number"
. label var edudeg "Educ II highest level"
. label var gender "Gender (1=male 2=female)"
. label var gvdisab "Q53 Govt should provide disability benefits?"
. label var gvhealth "Q50 Govt should provide health care?"
. label var gvhouse "Q51 Govt should provide housing?"
. label var gvjob "Q49 Govt should provide job?"
. label var id_iu "IU respondent id"
. label var impown "Q19 How likely X improve on own?"
. label var imptreat "Q20 How likely X improve w treatment?"
. label var opfam "Q02_01 X talk to family?"
. label var opforget "Q02_14 X forget about it?"
. label var opfriend "Q02_02 X talk to friends?"
. label var opnomed "Q02_08 X take non-prescrip meds?"
. label var oppremed "Q02_09 X take prescrip meds?"
. label var opspirit "Q02_07 X see spirit/trad healer?"
. label var pubfrght "Q73 How frightening was MH person?"
. label var puboften "Q72 How often see MH person in public?"
. label var pubsymp "Q74 How sympathethic to MH person?"
. label var sdchild "Q15 Would let X care for children?"
. label var sdfriend "Q16 Would be friends w X?"
. label var sdmarry "Q18 Would let X marry relative?"
. label var sdneighb "Q13 Would have X as neighbor?"
. label var sdsocial "Q14 Would socialize w X?"
. label var sdwork "Q17 Would work w X on job?"
. label var serious "Q01 How serious is Xs problem?"
. label var stfriend "Q24 Treatment makes X lose friends?"
. label var stlimits "Q25 Treatment limits Xs opportunities?"
.
label var stout "Q23 Treatment makes X an outsider?" . label var stuncmft "Q26 X makes me uncomfortable?" . label var tcdoc "Q46 Med doctor help important?" . label var tcfam "Q43 Family help important?" . label var tcfriend "Q44 Friends help important?" . label var trust "Q75 Can people be trusted?" . label var vignum "Vignette number" . . . // #5 . // closeup and save data . . quietly compress . note: wf-sgc03.dta \ revised var labels for source & default languages \ `tag' . label data "Workflow data for SGC renaming example \ `date'" . * in stata 10 and later: datasignature set, reset . save wf-sgc03, replace file wf-sgc03.dta saved . . // #6 . // verify data and check names . . use wf-sgc03, clear (Workflow data for SGC renaming example \ 2008-10-24) . * in stata 10 and later: datasignature confirm . notes _dta _dta: 1. wf-sgc-source.dta \ wf-sgc-support.do jsl 2008-04-03. 2. wf-sgc01.dta \ create clones of source variables \ wf5-sgc2a.do jsl 2008-10-24 3. wf-sgc02.dta \ rename source variables \ wf5-sgc2c.do jsl 2008-10-24. 4. language original uses the original, unrevised labels; language default uses revised labels \ wf5-sgc3b.do jsl 2008-10-24. 5. wf-sgc03.dta \ revised var labels for source & default languages \ wf5-sgc3b.do jsl 2008-10-24. . drop S* . . * default language . nmlab tcfam tcfriend vignum tcfam Q43 Family help important? tcfriend Q44 Friends help important? vignum Vignette number . . * original language . label language original . nmlab tcfam tcfriend vignum tcfam Q43 How Important: Turn to family for help tcfriend Q44 How Important: Turn to friends for help vignum Vignette . . log close log: D:\wf\work\wf5-sgc3b-varlab-revise.log log type: text closed on: 24 Oct 2008, 09:41:08 -------------------------------------------------------------------------------- . exit end of do-file . . * change value labels . do wf5-sgc4a-vallab-check.do . capture log close . 
log using wf5-sgc4a-vallab-check, replace text -------------------------------------------------------------------------------- log: D:\wf\work\wf5-sgc4a-vallab-check.log log type: text opened on: 24 Oct 2008, 09:41:08 . . // program: wf5-sgc4a-vallab-check.do \ for stata 9 . // task: check the value labels currently being used . // project: workflow chapter 5 - sgc renaming and relabeling example . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data . . use wf-sgc03, clear (Workflow data for SGC renaming example \ 2008-10-24) . * in stata 10 and later: datasignature confirm . notes _dta _dta: 1. wf-sgc-source.dta \ wf-sgc-support.do jsl 2008-04-03. 2. wf-sgc01.dta \ create clones of source variables \ wf5-sgc2a.do jsl 2008-10-24 3. wf-sgc02.dta \ rename source variables \ wf5-sgc2c.do jsl 2008-10-24. 4. language original uses the original, unrevised labels; language default uses revised labels \ wf5-sgc3b.do jsl 2008-10-24. 5. wf-sgc03.dta \ revised var labels for source & default languages \ wf5-sgc3b.do jsl 2008-10-24. . . // #2 . // inventory of existing value labels . . 
labelbook `valdeflist', length(10) -------------------------------------------------------------------------------- value label Ldist -------------------------------------------------------------------------------- values labels range: [1,4] string length: [16,20] N: 4 unique at full length: yes gaps: no unique at length 10: no missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 1 Definitely Willing 2 Probably Willing 3 Probably Unwilling 4 Definitely Unwilling in default attached to sdneighb sdsocial sdchild sdfriend sdwork sdmarry Ssdlive Ssdsocial Ssdchild Ssdfriend Ssdwork Ssdmarry in original attached to sdneighb sdsocial sdchild sdfriend sdwork sdmarry Ssdlive Ssdsocial Ssdchild Ssdfriend Ssdwork Ssdmarry -------------------------------------------------------------------------------- value label Ldummy -------------------------------------------------------------------------------- values labels range: [1,2] string length: [2,3] N: 2 unique at full length: yes gaps: no unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 1 Yes 2 No in default attached to opfam opfriend opspirit opnomed oppremed opforget Sopfam Sopfriend Stospi Stonpm Soppme Sopforg in original attached to opfam opfriend opspirit opnomed oppremed opforget Sopfam Sopfriend Stospi Stonpm Soppme Sopforg -------------------------------------------------------------------------------- value label Limport -------------------------------------------------------------------------------- values labels range: [1,10] string length: [14,20] N: 2 unique at full length: yes gaps: yes unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 1 Not at all Important 10 Very Important in default attached to tcfam tcfriend tcdoc Stcfam Stcfriend Stcdoc in original attached to tcfam tcfriend tcdoc Stcfam Stcfriend Stcdoc 
-------------------------------------------------------------------------------- value label Llikely -------------------------------------------------------------------------------- values labels range: [1,4] string length: [11,17] N: 4 unique at full length: yes gaps: no unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 1 Very Likely 2 Somewhat Likely 3 Not Very Likely 4 Not at all Likely in default attached to atdisease atraised atgenet impown imptreat Satdisease Satraised Satgenes Simpown Simptreat in original attached to atdisease atraised atgenet impown imptreat Satdisease Satraised Satgenes Simpown Simptreat -------------------------------------------------------------------------------- value label Llikert -------------------------------------------------------------------------------- values labels range: [1,4] string length: [5,17] N: 4 unique at full length: yes gaps: no unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 1 Strongly agree 2 Agree 3 Disagree 4 Strongly disagree in default attached to stout stfriend stlimits stuncmft Sstout Sstfriend Sstlimits Sstuncom in original attached to stout stfriend stlimits stuncmft Sstout Sstfriend Sstlimits Sstuncom -------------------------------------------------------------------------------- value label Lrespons -------------------------------------------------------------------------------- values labels range: [1,4] string length: [18,22] N: 4 unique at full length: yes gaps: no unique at length 10: no missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 1 Definitely should be 2 Probably should be 3 Probably shouldnt be 4 Definitely shouldnt be in default attached to gvjob gvhealth gvhouse gvdisab clawdoc clawpmed clawhosp Sgvjob Sgvhealth Sgvhous Sgvdisben Sctxfdoc Sctxfmed Sctxfhos in original attached to gvjob gvhealth gvhouse gvdisab 
clawdoc clawpmed clawhosp Sgvjob Sgvhealth Sgvhous Sgvdisben Sctxfdoc Sctxfmed Sctxfhos -------------------------------------------------------------------------------- value label age -------------------------------------------------------------------------------- values labels range: [18,97] string length: [8,11] N: 2 unique at full length: yes gaps: yes unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 18 18 years 97 97 or older in default attached to age Sage in original attached to age Sage -------------------------------------------------------------------------------- value label cause -------------------------------------------------------------------------------- values labels range: [1,7] string length: [6,20] N: 6 unique at full length: yes gaps: yes unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 1 Depression 2 Asthma 3 Schizophrenia 4 Stress 5 Something else 7 More than one answer in default attached to cause Scause in original attached to cause Scause -------------------------------------------------------------------------------- value label cntry_iu -------------------------------------------------------------------------------- values labels range: [11,28] string length: [5,14] N: 18 unique at full length: yes gaps: no unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 11 Argentina 12 Bangladesh 13 Brazil 14 Bulgaria 15 Cyprus 16 Finland 17 Germany 18 Hungary 19 Iceland 20 Japan 21 Korea Republic 22 Nepal 23 New Zealand 24 Philippines 25 South Africa 26 Spain 27 United Kingdom 28 United States in default attached to cntry_iu Scntry_iu in original attached to cntry_iu Scntry_iu -------------------------------------------------------------------------------- value label edudeg 
-------------------------------------------------------------------------------- values labels range: [0,5] string length: [23,28] N: 6 unique at full length: yes gaps: no unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 0 No formal qualification 1 Lowest formal qualification 2 Above lowest qualification 3 Higher secondary completed 4 Above higher secondary level 5 University degree completed in default attached to edudeg Sedudeg in original attached to edudeg Sedudeg -------------------------------------------------------------------------------- value label gender -------------------------------------------------------------------------------- values labels range: [1,2] string length: [4,6] N: 2 unique at full length: yes gaps: no unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 1 Male 2 Female in default attached to gender Sgender in original attached to gender Sgender -------------------------------------------------------------------------------- value label marital -------------------------------------------------------------------------------- values labels range: [1,6] string length: [7,29] N: 6 unique at full length: yes gaps: no unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 1 Married 2 Widowed 3 Divorced 4 Separated, but married 5 Living as a couple/Cohabiting 6 Single, never married in default attached to marital Smarital in original attached to marital Smarital -------------------------------------------------------------------------------- value label pubfright -------------------------------------------------------------------------------- values labels range: [1,4] string length: [16,22] N: 4 unique at full length: yes gaps: no unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 1 
Very Frightening 2 Somewhat Frightening 3 Not Very Frightening 4 Not at all Frightening in default attached to pubfrght Spubfright in original attached to pubfrght Spubfright -------------------------------------------------------------------------------- value label puboften -------------------------------------------------------------------------------- values labels range: [1,4] string length: [5,12] N: 4 unique at full length: yes gaps: no unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 1 Frequently 2 Occasionally 3 Rarely 4 Never in default attached to puboften Spuboften in original attached to puboften Spuboften -------------------------------------------------------------------------------- value label pubsymp -------------------------------------------------------------------------------- values labels range: [1,4] string length: [17,24] N: 4 unique at full length: yes gaps: no unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 1 No Sympathy at all 2 A Little Sympathy 3 Quite a bit of Sympathy 4 A Great deal of Sympathy in default attached to pubsymp Spubsymp in original attached to pubsymp Spubsymp -------------------------------------------------------------------------------- value label serious -------------------------------------------------------------------------------- values labels range: [0,4] string length: [3,18] N: 5 unique at full length: yes gaps: no unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 0 NAP 1 Very serious 2 Moderately serious 3 Not very serious 4 Not at all serious in default attached to serious Sserious in original attached to serious Sserious -------------------------------------------------------------------------------- value label trust -------------------------------------------------------------------------------- values 
labels range: [1,2] string length: [23,26] N: 2 unique at full length: yes gaps: no unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 1 Most people can be trusted 2 Need to be very careful in default attached to trust Strust in original attached to trust Strust -------------------------------------------------------------------------------- value label vignum -------------------------------------------------------------------------------- values labels range: [1,12] string length: [19,34] N: 12 unique at full length: yes gaps: no unique at length 10: no missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 1 Depressive Disorder/Majority/Man 2 Depressive Disorder/Majority/Woman 3 Depressive Disorder/Minority/Man 4 Depressive Disorder/Minority/Woman 5 Schizophrenia/Majority/Man 6 Schizophrenia/Majority/Woman 7 Schizophrenia/Minority/Man 8 Schizophrenia/Minority/Woman 9 Asthma/Majority/Man 10 Asthma/Majority/Woman 11 Asthma/Minority/Man 12 Asthma/Minority/Woman in default attached to vignum Svignum in original attached to vignum Svignum -------------------------------------------------------------------------------- value label wrkstat -------------------------------------------------------------------------------- values labels range: [1,10] string length: [7,36] N: 10 unique at full length: yes gaps: no unique at length 10: yes missing .*: 0 null string: no leading/trailing blanks: no numeric -> numeric: no definition 1 Employed-full time 2 Employed-part time 3 Employed-less than part-time 4 Helping family member 5 Unemployed (Laid off, Without a job) 6 Student,school,vocational training 7 Retired 8 Housewife, home duties 9 Permanently disabled 10 Other, not in labour force in default attached to wrkstat Swrkstat in original attached to wrkstat Swrkstat . . // #3 - not in book . // get list of value labels . . * get list of non-S variables . drop S* . 
unab varlist : _all
.
. * define local to hold list of label definitions
. local valdeflist ""
.
. * loop through variables and add value label names to valdeflist
. foreach varname in `varlist' {
  2.     local vallabel : value label `varname'
  3.     local valdeflist "`valdeflist' `vallabel'"
  4. }
.
. * list of value labels with duplicates
. display "`valdeflist'"
cntry_iu vignum serious Ldummy Ldummy Ldummy Ldummy Ldummy Ldummy Llikely Llik
> ely Llikely Ldist Ldist Ldist Ldist Ldist Ldist Llikely Llikely Llikert Lliker
> t Llikert Llikert Limport Limport Limport Lrespons Lrespons Lrespons Lrespons
> Lrespons Lrespons Lrespons cause puboften pubfright pubsymp trust gender age w
> rkstat marital edudeg
.
. * remove duplicates
. local valdeflist : list uniq valdeflist
. display "`valdeflist'"
cntry_iu vignum serious Ldummy Llikely Ldist Llikert Limport Lrespons cause pubo
> ften pubfright pubsymp trust gender age wrkstat marital edudeg
.
. log close
       log:  D:\wf\work\wf5-sgc4a-vallab-check.log
  log type:  text
 closed on:  24 Oct 2008, 09:41:09
--------------------------------------------------------------------------------
. exit

end of do-file

. do wf5-sgc4b-vallab-dump.do
. capture log close
. log using wf5-sgc4b-vallab-dump, replace text
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf5-sgc4b-vallab-dump.log
  log type:  text
 opened on:  24 Oct 2008, 09:41:09
.
. // program: wf5-sgc4b-vallab-dump.do \ for stata 9
. // task: dump label define commands to be edited
. // project: workflow chapter 5 - sgc renaming and relabeling example
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data
.
. use wf-sgc03, clear
(Workflow data for SGC renaming example \ 2008-10-24)
. * in stata 10 and later: datasignature confirm
. notes _dta

_dta:
  1.  wf-sgc-source.dta \ wf-sgc-support.do jsl 2008-04-03.
  2.
wf-sgc01.dta \ create clones of source variables \ wf5-sgc2a.do jsl 2008-10-24 3. wf-sgc02.dta \ rename source variables \ wf5-sgc2c.do jsl 2008-10-24. 4. language original uses the original, unrevised labels; language default uses revised labels \ wf5-sgc3b.do jsl 2008-10-24. 5. wf-sgc03.dta \ revised var labels for source & default languages \ wf5-sgc3b.do jsl 2008-10-24. . drop S* . . // #2 . // get list of value labels . . quietly labelbook . local valdeflist = r(names) . . // #3 - approach 1 - easy to produce list but harder to edit . // create label define commands to edit . . label save `valdeflist' using /// > wf5-sgc4b-vallab-labelsave-dummy.doi, replace file wf5-sgc4b-vallab-labelsave-dummy.doi saved . . // #3 - approach 2 - harder to produce list but easier to edit . // create label define commands to edit . . * create a dataset with value labels . uselabel `valdeflist' , clear . . * here is what the uselabel dataset looks like . list in 1/4, clean lname value label trunc 1. Ldist 1 Definitely Willing 0 2. Ldist 2 Probably Willing 0 3. Ldist 3 Probably Unwilling 0 4. Ldist 4 Definitely Unwilling 0 . . * open file to contain label define commands . capture file close myfile . file open myfile using wf5-sgc4b-vallab-labdef-dummy.doi, write replace . . * loop through dataset of value labels and save label define commands . local rownum = 0 // counter for current row . local priorlbl "" // name of prior label that was printed . . while `rownum' <= _N { // loop through all rows of dataset 2. . local ++rownum 3. * retrieve information from current row . local lblnm = lname[`rownum'] // name of value label 4. local lblval = value[`rownum'] // specific value being labeled 5. local lbllbl = label[`rownum'] // name assigned to that value 6. . * get first letter of label to determine if it is a missing value label . local startletter = substr("`lblval'",1,1) 7. . * if name of label has changed, write header . if "`priorlbl'"!="`lblnm'" { 8. 
file write myfile "//" _col(30) `""1234567890""' _newline 9. } 10. . * only write a label define command if the value is not a missing value . if "`startletter'"!="." { 11. . file write myfile /// > "label define N`lblnm' " _col(25) "`lblval'" /// > _col(30) `""`lbllbl'""' ", modify" _newline 12. } 13. . * before starting with a new label, the prior label becomes the current la > bel . local priorlbl "`lblnm'" 14. . } . . file close myfile . . // #4 . // create label value commands . . * reload data and get list of non-source variables . use wf-sgc03, clear (Workflow data for SGC renaming example \ 2008-10-24) . drop S* . aorder . unab varlist : _all . . * open file to contain label value commands . file open myfile using wf5-sgc4b-vallab-labval-dummy.doi, write replace . . * loop through variable list and create label value commands . foreach varname in `varlist' { 2. . * get the label assigned to the current variable . local lblnm : value label `varname' 3. . * if a label is defined, write a label value command . if "`lblnm'"!="" { 4. file write myfile /// > "label value `varname'" _col(27) "N`lblnm'" _newline 5. } 6. } . . file close myfile . . log close log: D:\wf\work\wf5-sgc4b-vallab-dump.log log type: text closed on: 24 Oct 2008, 09:41:09 -------------------------------------------------------------------------------- . exit end of do-file . do wf5-sgc4c-vallab-revise.do . capture log close . log using wf5-sgc4c-vallab-revise, replace text -------------------------------------------------------------------------------- log: D:\wf\work\wf5-sgc4c-vallab-revise.log log type: text opened on: 24 Oct 2008, 09:41:09 . . // program: wf5-sgc4c-vallab-revise.do \ for stata 9 . // include: requires wf5-sgc4b-vallab-labdef-revised.doi . // & wf5-sgc4b-vallab-labval-revised.doi . // task: step 4c: create new value labels . // project: workflow chapter 5 - sgc renaming and relabeling example . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . 
set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // define local . . local date "2008-10-24" . local tag "wf5-sgc4c.do jsl `date'." . . // #2 . // load data . . use wf-sgc03, clear (Workflow data for SGC renaming example \ 2008-10-24) . * in stata 10 and later: datasignature confirm . . // #3 . // create new value label definitions and assign labels . . include wf5-sgc4b-vallab-labdef-revised.doi . // include: wf5-sgc4b-vallab-labdef-revised.doi . // used by: wf5-sgc4c-vallab-revise.do \ for stata 9 . // task: revised label definitions for SGC . // project: workflow chapter 5 . // author: scott long \ 2008-10-24 . . // Note: new labels are named with N at start of name . . // 1234567890 . label define NLdist 1 "1Definite", modify . label define NLdist 2 "2Probably", modify . label define NLdist 3 "3ProbNot", modify . label define NLdist 4 "4DefNot", modify . . // 1234567890 . label define NLdummy 1 "1_Yes", modify . label define NLdummy 2 "2_No", modify . . // 1234567890 . label define NLimport 1 "1NotAtAll", modify . label define NLimport 10 "10Very", modify . . // 1234567890 . label define NLlikely 1 "1V_Likely", modify . label define NLlikely 2 "2Somewhat", modify . label define NLlikely 3 "3NotVery", modify . label define NLlikely 4 "4NotAtAll", modify . . // 1234567890 . label define NLlikert 1 "1StAgree", modify . label define NLlikert 2 "2Agree", modify . label define NLlikert 3 "3Disagree", modify . label define NLlikert 4 "4StDisagr", modify . . // 1234567890 . label define NLrespons 1 "1Definite", modify . label define NLrespons 2 "2Probably", modify . label define NLrespons 3 "3ProbNot", modify . label define NLrespons 4 "4DefNot", modify . . // 1234567890 . label define Nage 18 "18 years", modify . label define Nage 97 ">=97", modify . . // 1234567890 . label define Ncause 1 "1Depres", modify . label define Ncause 2 "2Asthma", modify . label define Ncause 3 "3Schizo", modify . 
label define Ncause 4 "4Stress", modify . label define Ncause 5 "5Other", modify . label define Ncause 7 "7Multipl", modify . . // 1234567890 . label define Ncntry_iu 11 "Argentina", modify . label define Ncntry_iu 12 "Bangladesh", modify . label define Ncntry_iu 13 "Brazil", modify . label define Ncntry_iu 14 "Bulgaria", modify . label define Ncntry_iu 15 "Cyprus", modify . label define Ncntry_iu 16 "Finland", modify . label define Ncntry_iu 17 "Germany", modify . label define Ncntry_iu 18 "Hungary", modify . label define Ncntry_iu 19 "Iceland", modify . label define Ncntry_iu 20 "Japan", modify . label define Ncntry_iu 21 "Korea", modify . label define Ncntry_iu 22 "Nepal", modify . label define Ncntry_iu 23 "NewZealand", modify . label define Ncntry_iu 24 "Philippines", modify . label define Ncntry_iu 25 "SouthAfrica", modify . label define Ncntry_iu 26 "Spain", modify . label define Ncntry_iu 27 "UK", modify . label define Ncntry_iu 28 "USA", modify . . // 1234567890 . label define Nedudeg 0 "0NoFormal", modify . label define Nedudeg 1 "1Lowest", modify . label define Nedudeg 2 "2AboveLow", modify . label define Nedudeg 3 "3Second", modify . label define Nedudeg 4 "4Above2nd", modify . . // 1234567890 . label define Ngender 1 "1_Male", modify . label define Ngender 2 "2_Female", modify . . // 1234567890 . label define Nmarital 1 "1Married", modify . label define Nmarital 2 "2Widowed", modify . label define Nmarital 3 "3Divorced", modify . label define Nmarital 4 "4Separatd", modify . label define Nmarital 5 "5Cohabit", modify . label define Nmarital 6 "6Single", modify . . // 1234567890 . label define Npubfright 1 "1Very", modify . label define Npubfright 2 "2Some", modify . label define Npubfright 3 "3NotVery", modify . label define Npubfright 4 "4NotAtAll", modify . . // 1234567890 . label define Npuboften 1 "1Frequent", modify . label define Npuboften 2 "2Occasion", modify . label define Npuboften 3 "3Rarely", modify . 
label define Npuboften 4 "4Never", modify
.
. // 1234567890
. label define Npubsymp 1 "1None", modify
. label define Npubsymp 2 "2Little", modify
. label define Npubsymp 3 "3QuiteBit", modify
. label define Npubsymp 4 "4Great", modify
.
. // 1234567890
. label define Nserious 0 "NAP", modify
. label define Nserious 1 "1VSerious", modify
. label define Nserious 2 "2Moderate", modify
. label define Nserious 3 "3NotVery", modify
. label define Nserious 4 "4NotAtAll", modify
.
. // 1234567890
. label define Ntrust 1 "1Trust", modify
. label define Ntrust 2 "2Careful", modify
.
. // 1234567890
. label define Nvignum 1 "1DepMIn", modify
. label define Nvignum 2 "2DepFIn", modify
. label define Nvignum 3 "3DepMOut", modify
. label define Nvignum 4 "4DepFOut", modify
. label define Nvignum 5 "5SchMIn", modify
. label define Nvignum 6 "6SchFIn", modify
. label define Nvignum 7 "7SchMOut", modify
. label define Nvignum 8 "8SchFOut", modify
. label define Nvignum 9 "9AstMIn", modify
. label define Nvignum 10 "10AsFIn", modify
. label define Nvignum 11 "11AstMOut", modify
. label define Nvignum 12 "12AstFOut", modify
.
. // 1234567890
. label define Nwrkstat 1 "1FullTime", modify
. label define Nwrkstat 2 "2PartTime", modify
. label define Nwrkstat 3 "3

-> tabulation of art

       # of |
   articles |
  published |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |         85       20.83       20.83
          1 |        102       25.00       45.83
          2 |         72       17.65       63.48
          3 |         49       12.01       75.49
          4 |         45       11.03       86.52
          5 |         25        6.13       92.65
          6 |         13        3.19       95.83
          7 |          9        2.21       98.04
          8 |          2        0.49       98.53
          9 |          1        0.25       98.77
         10 |          2        0.49       99.26
         12 |          1        0.25       99.51
         15 |          1        0.25       99.75
         18 |          1        0.25      100.00
------------+-----------------------------------
      Total |        408      100.00

. tab1 fem fel, missing

-> tabulation of fem

    Gender: |
   1=female |
     0=male |      Freq.     Percent        Cum.
------------+-----------------------------------
     0_Male |        249       61.03       61.03
   1_Female |        159       38.97      100.00
------------+-----------------------------------
      Total |        408      100.00

-> tabulation of fel

    Fellow: |
 1=yes 0=no |      Freq.     Percent        Cum.
------------+-----------------------------------
0_NotFellow |        156       38.24       38.24
   1_Fellow |        252       61.76      100.00
------------+-----------------------------------
      Total |        408      100.00

.
. // #3b - stem for each variable with a loop
.
. foreach var in art cit phd job ment {
  2.     stem `var'
  3. }

Stem-and-leaf plot for art (# of articles published)

  0* | 000000000000000000000000000000000000000000000000000000000000000 ... (85)
  0* | 11111111111111111111111111111111111111111111111111111111111111 ... (102)
  0* | 222222222222222222222222222222222222222222222222222222222222222222222222
  0* | 3333333333333333333333333333333333333333333333333
  0* | 444444444444444444444444444444444444444444444
  0* | 5555555555555555555555555
  0* | 6666666666666
  0* | 777777777
  0* | 88
  0* | 9
  1* | 00
  1* |
  1* | 2
  1* |
  1* |
  1* | 5
  1* |
  1* |
  1* | 8

Stem-and-leaf plot for cit (# of citations received)

   0* | 0000000000000000000000000000000000000000000000000000000000000 ...
(212)
   1* | 0000001111222222333333333444444555555567788888888899999
   2* | 000001111122223333334444445566666777789
   3* | 0000122222333444556667789
   4* | 122335566777788888889
   5* | 1144556677789
   6* | 0034555556667
   7* | 01145778
   8* | 012368
   9* |
  10* | 0057
  11* | 3
  12* | 03
  13* |
  14* | 069
  15* | 4
  16* | 39
  17* |
  18* |
  19* |
  20* | 113

Stem-and-leaf plot for phd (PhD prestige)

phd rounded to nearest multiple of .01
plot in units of .01

  10* | 00
  11* | 8888
  12* | 2255588
  13* |
  14* | 00002
  15* | 23
  16* | 3337888
  17* | 4455666
  18* | 00011116
  19* | 014455
  20* | 00055
  21* | 000000000112222234558
  22* | 011134566666
  23* | 226666699999999
  24* | 3
  25* | 00000111222444555666666668888
  26* | 1
  27* | 222226666
  28* | 33366666666677777888
  29* | 666666
  30* | 4444449
  31* | 5555999
  32* | 000000000000
  33* | 266666666666666666666666
  34* | 00000112777
  35* | 222222222222222222222224444999999999
  36* | 22888888889999
  37* | 55555
  38* | 4444444444445
  39* | 22
  40* | 000000000
  41* | 44466666666
  42* | 59999999999999999999999999999999999999
  43* | 2222222224444
  44* | 8888888
  45* | 4444444444444
  46* | 222222444444444444
  47* |
  48* | 0

Stem-and-leaf plot for job (Prestige of first job)

job rounded to nearest multiple of .01
plot in units of .01

  10* | 00000000000000000000000000000000000000000000000000000000000000 ...
(99) 11* | 00 12* | 0029 13* | 2 14* | 000004777788 15* | 033333368 16* | 022333588 17* | 002459 18* | 0334444446899 19* | 00001444444444456789 20* | 0223 21* | 0000011111111222335555666 22* | 000234488 23* | 0000256666799 24* | 0000177779 25* | 00122222266666666666666666666666 26* | 27* | 2222222222222222222 28* | 88888888888 29* | 30* | 444444444444444444444444 31* | 32* | 00000000000000000 33* | 666666666666666666 34* | 35* | 222222222222222222 36* | 888888888888 37* | 38* | 4444444 39* | 40* | 000 41* | 6 42* | 43* | 22 44* | 888 45* | 46* | 4444 47* | 48* | 0 Stem-and-leaf plot for ment (Citations received by mentor) ment rounded to integers 0* | 0000000000000000000000000000000000000000000000000111111111111 ... (138) 1* | 00000000111111122222222223333334444444445555666667777777888888899999999 2* | 000011122333445666667777888999 3* | 00001111112233335666777788999999 4* | 1111134444566667889 5* | 000022223445555677999 6* | 122233445556678 7* | 033468 8* | 0034455667888999 9* | 011245 10* | 3589 11* | 03466 12* | 2346 13* | 777 14* | 0244467 15* | 059 16* | 39 17* | 668 18* | 19* | 788 20* | 33444 21* | 6 22* | 23* | 22339 24* | 18 25* | 8 26* | 27* | 7 28* | 48 29* | 30* | 4 31* | 32* | 33* | 9 34* | 35* | 36* | 37* | 38* | 39* | 40* | 41* | 42* | 43* | 44* | 45* | 46* | 47* | 48* | 49* | 50* | 51* | 52* | 53* | 2 . . // #4 . // standard dotplot with stata 9 graphics . . // #4a - standard dotplot . . dotplot cit . . // #4b - dotplots in stata 9 graphics with loop . . foreach var in art cit phd job ment { 2. dotplot `var' 3. graph export wf6-review-biochem-`var'.eps, replace 4. 
} (note: file wf6-review-biochem-art.eps not found) (file wf6-review-biochem-art.eps written in EPS format) (note: file wf6-review-biochem-cit.eps not found) (file wf6-review-biochem-cit.eps written in EPS format) (note: file wf6-review-biochem-phd.eps not found) (file wf6-review-biochem-phd.eps written in EPS format) (note: file wf6-review-biochem-job.eps not found) (file wf6-review-biochem-job.eps written in EPS format) (note: file wf6-review-biochem-ment.eps not found) (file wf6-review-biochem-ment.eps written in EPS format) . . // #4c - dotplot in stata 7 graphics . . version 7: dotplot cit . graph export wf6-review-biochem-cit-stata7.eps, replace (note: file wf6-review-biochem-cit-stata7.eps not found) (file wf6-review-biochem-cit-stata7.eps written in EPS format) . . log close log: D:\wf\work\wf6-review-biochem.log log type: text closed on: 24 Oct 2008, 09:41:13 -------------------------------------------------------------------------------- . exit end of do-file . do wf6-review-gss.do . log using wf6-review-gss, replace text (note: file D:\wf\work\wf6-review-gss.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf6-review-gss.log log type: text opened on: 24 Oct 2008, 09:41:13 . . // program: wf6-review-gss.do \ for stata 9 . // task: Review of GSS data for v4 . // project: workflow chapter 6 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load dataset . . use wf-gsswarm, clear (Workflow data from 2002 GSS on women and work \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . // #2 . // check distribution of v4 . . * with value labels . tabulate v4, miss Workg mom: warm relation | child ok | Freq. Percent Cum. 
---------------------------+----------------------------------- Strongly agree | 468 39.97 39.97 Agree | 383 32.71 72.67 Neither agree nor disagree | 124 10.59 83.26 Strongly disagree | 184 15.71 98.98 Cant choose | 11 0.94 99.91 Na, refused | 1 0.09 100.00 ---------------------------+----------------------------------- Total | 1,171 100.00 . . * without value labels . tabulate v4, nolab Workg mom: | warm | relation | child ok | Freq. Percent Cum. ------------+----------------------------------- 1 | 468 39.97 39.97 2 | 383 32.71 72.67 3 | 124 10.59 83.26 5 | 184 15.71 98.98 8 | 11 0.94 99.91 9 | 1 0.09 100.00 ------------+----------------------------------- Total | 1,171 100.00 . . log close log: D:\wf\work\wf6-review-gss.log log type: text closed on: 24 Oct 2008, 09:41:13 -------------------------------------------------------------------------------- . exit end of do-file . do wf6-review-timetophd.do . capture log close . log using wf6-review-timetophd, replace text (note: file D:\wf\work\wf6-review-timetophd.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf6-review-timetophd.log log type: text opened on: 24 Oct 2008, 09:41:13 . . // program: wf6-review-timetophd.do \ for stata 9 . // task: simulating the effects of mis-labeled enrolled time . // project: workflow chapter 6 . // author: scott long \ 2008-10-24 . . // note: data was simulated to illustrate properties of real data . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data . . use wf-acpub, replace (Workflow data on scientific productivity \ 2008-04-04) . * in stata 10 and later: datasignature confirm . . // #2 . // estimate model with supposedly correct data . .
nbreg pub enrol phd female, nolog irr Negative binomial regression Number of obs = 278 LR chi2(3) = 23.54 Dispersion = mean Prob > chi2 = 0.0000 Log likelihood = -606.28466 Pseudo R2 = 0.0190 ------------------------------------------------------------------------------ pub | IRR Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- enrol | 1.056071 .0156467 3.68 0.000 1.025845 1.087188 phd | 1.103679 .0654233 1.66 0.096 .9826206 1.239652 female | .7533 .0968775 -2.20 0.028 .5854637 .9692504 -------------+---------------------------------------------------------------- /lnalpha | -.4592692 .1471825 -.7477416 -.1707969 -------------+---------------------------------------------------------------- alpha | .6317451 .0929818 .4734346 .8429928 ------------------------------------------------------------------------------ Likelihood-ratio test of alpha=0: chibar2(01) = 172.26 Prob>=chibar2 = 0.000 . . // #3 . // check enrol . . tabulate enrol Elapsed | time from | BS to PhD | Freq. Percent Cum. ------------+----------------------------------- 3 | 5 1.80 1.80 4 | 63 22.66 24.46 5 | 82 29.50 53.96 6 | 57 20.50 74.46 7 | 39 14.03 88.49 8 | 17 6.12 94.60 9 | 8 2.88 97.48 10 | 1 0.36 97.84 14 | 1 0.36 98.20 24 | 2 0.72 98.92 25 | 3 1.08 100.00 ------------+----------------------------------- Total | 278 100.00 . pwcorr pub enrol | pub enrol -------------+------------------ pub | 1.0000 enrol | 0.3797 1.0000 . . // #4 . // estimate model with corrected variable for enrolled time . . nbreg pub enrol_fixed phd female, nolog irr Negative binomial regression Number of obs = 278 LR chi2(3) = 26.97 Dispersion = mean Prob > chi2 = 0.0000 Log likelihood = -604.5674 Pseudo R2 = 0.0218 ------------------------------------------------------------------------------ pub | IRR Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- enrol_fixed | .82013 .037127 -4.38 0.000 .7504973 .8962233 phd | 1.112075 .0666021 1.77 0.076 .9889072 1.250582 female | .7450266 .0964034 -2.27 0.023 .5781357 .960094 -------------+---------------------------------------------------------------- /lnalpha | -.4493616 .1428516 -.7293456 -.1693777 -------------+---------------------------------------------------------------- alpha | .6380353 .0911444 .4822244 .84419 ------------------------------------------------------------------------------ Likelihood-ratio test of alpha=0: chibar2(01) = 211.39 Prob>=chibar2 = 0.000 . . // #4 . // compare distributions . . sum enrol_* Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- enrol_fixed | 278 5.564748 1.467253 3 14 . label var enrol_fixed "enroll_fixed: enrolled time" . label var enrol "enrol: elapsed time" . dotplot enrol_fixed enrol, /// > ytitle("Years",size(medium)) xlabel(,labsize(medium)) . graph export wf6-review-timetophd.eps, replace (note: file wf6-review-timetophd.eps not found) (file wf6-review-timetophd.eps written in EPS format) . . log close log: D:\wf\work\wf6-review-timetophd.log log type: text closed on: 24 Oct 2008, 09:41:15 -------------------------------------------------------------------------------- . exit end of do-file . do wf6-review-phdspike.do . capture log close . log using wf6-review-phdspike, replace text (note: file D:\wf\work\wf6-review-phdspike.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf6-review-phdspike.log log type: text opened on: 24 Oct 2008, 09:41:15 . . // program: wf6-review-phdspike.do \ for stata 9 . // task: Substantive review of PhD prestige . // project: workflow chapter 6 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . 
macro drop _all . . // #1 . // load data . . use wf-acjob, clear (Workflow data on academic biochemists \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . // #2 . // check descriptives . . summarize phd Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- phd | 408 3.200564 .9537509 1 4.8 . codebook, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- job 408 80 2.233431 1 4.8 Prestige of first job fem 408 2 .3897059 0 1 Gender: 1=female 0=male phd 408 89 3.200564 1 4.8 PhD prestige ment 408 123 45.47058 0 531.9999 Citations received by mentor fel 408 2 .6176471 0 1 Fellow: 1=yes 0=no art 408 14 2.276961 0 18 # of articles published cit 408 87 21.71569 0 203 # of citations received -------------------------------------------------------------------------------- . . // #3 . // check specific values . . stem phd Stem-and-leaf plot for phd (PhD prestige) phd rounded to nearest multiple of .01 plot in units of .01 10* | 00 11* | 8888 12* | 2255588 13* | 14* | 00002 15* | 23 16* | 3337888 17* | 4455666 18* | 00011116 19* | 014455 20* | 00055 21* | 000000000112222234558 22* | 011134566666 23* | 226666699999999 24* | 3 25* | 00000111222444555666666668888 26* | 1 27* | 222226666 28* | 33366666666677777888 29* | 666666 30* | 4444449 31* | 5555999 32* | 000000000000 33* | 266666666666666666666666 34* | 00000112777 35* | 222222222222222222222224444999999999 36* | 22888888889999 37* | 55555 38* | 4444444444445 39* | 22 40* | 000000000 41* | 44466666666 42* | 59999999999999999999999999999999999999 43* | 2222222224444 44* | 8888888 45* | 4444444444444 46* | 222222444444444444 47* | 48* | 0 . dotplot phd . graph export wf6-review-phdspike.eps, replace (note: file wf6-review-phdspike.eps not found) (file wf6-review-phdspike.eps written in EPS format) . . // #4 . // check phd prestige to explain spike . . 
tab1 phd if phd>4 & phd<4.5 -> tabulation of phd if phd>4 & phd<4.5 PhD | prestige | Freq. Percent Cum. ------------+----------------------------------- 4.14 | 3 4.35 4.35 4.16 | 8 11.59 15.94 4.25 | 1 1.45 17.39 4.29 | 37 53.62 71.01 4.32 | 9 13.04 84.06 4.34 | 4 5.80 89.86 4.48 | 7 10.14 100.00 ------------+----------------------------------- Total | 69 100.00 . . log close log: D:\wf\work\wf6-review-phdspike.log log type: text closed on: 24 Oct 2008, 09:41:16 -------------------------------------------------------------------------------- . exit end of do-file . do wf6-review-jobphd.do . capture log close . log using wf6-review-jobphd, replace text (note: file D:\wf\work\wf6-review-jobphd.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf6-review-jobphd.log log type: text opened on: 24 Oct 2008, 09:41:16 . . // program: wf6-review-jobphd.do \ for stata 9 . // task: Looking at pairs of variables . // project: workflow chapter 6 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data . . use wf-acjob, clear (Workflow data on academic biochemists \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . // #2 . // check range of values for phd and job . . codebook phd job, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- phd 408 89 3.200564 1 4.8 PhD prestige job 408 80 2.233431 1 4.8 Prestige of first job -------------------------------------------------------------------------------- . . // #3 . // compare job and phd histograms . . label var phd "phd: PhD prestige" . label var job "job: Prestige of first job" . dotplot phd job, /// > xlabel(,labsize(medium)) . 
graph export wf6-review-jobphd-phdjob-hist.eps, replace (note: file wf6-review-jobphd-phdjob-hist.eps not found) (file wf6-review-jobphd-phdjob-hist.eps written in EPS format) . . // #4 . // simple scatter plot . . scatter job phd . graph export wf6-review-jobphd-phdjob.eps, replace (note: file wf6-review-jobphd-phdjob.eps not found) (file wf6-review-jobphd-phdjob.eps written in EPS format) . . // #5 . // scatter plot spruced up . . scatter job phd, msymbol(circle_hollow) /// > ylabel(, grid) xlabel(, grid) aspectratio(1) . graph export wf6-review-jobphd-phdjob-nice.eps, replace (note: file wf6-review-jobphd-phdjob-nice.eps not found) (file wf6-review-jobphd-phdjob-nice.eps written in EPS format) . . // #6 . // scatter plot spruced up with jitter . . scatter job phd, msymbol(circle_hollow) jitter(8) /// > ylabel(, grid) xlabel(, grid) aspectratio(1) . graph export wf6-review-jobphd-phdjob-jitter.eps, replace (note: file wf6-review-jobphd-phdjob-jitter.eps not found) (file wf6-review-jobphd-phdjob-jitter.eps written in EPS format) . . // #7 . // all pairs . . // #7a - scatter plot matrix . . graph matrix phd job ment art cit fem fel, /// > jitter(3) half msymbol(circle_hollow) . graph export wf6-review-jobphd-matrix.eps, replace (note: file wf6-review-jobphd-matrix.eps not found) (file wf6-review-jobphd-matrix.eps written in EPS format) . . // #7b - individual bivariate scatter plots . . local varlist "job phd ment art cit fem fel" . local nvars : word count `varlist' . . forvalues y_varnum = 1/`nvars' { 2. * retrieve the name of variable for y axis . local y_var : word `y_varnum' of `varlist' 3. * get the variable label . local y_lbl : variable label `y_var' 4. * create label with var name and label combined . label var `y_var' "`y_var': `y_lbl'" 5. * loop through x variables . local x_start = `y_varnum' + 1 6. forvalues x_varnum = `x_start'/`nvars' { 7. * create var labels for x variables . local x_var : word `x_varnum' of `varlist' 8. 
local x_lbl : variable label `x_var' 9. label var `x_var' "`x_var': `x_lbl'" 10. * create graph . scatter `y_var' `x_var', msymbol(circle_hollow) jitter(8) /// > ylabel(, grid) xlabel(, grid) aspectratio(1) 11. graph export wf6-review-jobphd-`y_var'-`x_var'.eps, replace 12. * reset variable label for x-var . label var `x_var' "`x_lbl'" 13. } 14. } (note: file wf6-review-jobphd-job-phd.eps not found) (file wf6-review-jobphd-job-phd.eps written in EPS format) (note: file wf6-review-jobphd-job-ment.eps not found) (file wf6-review-jobphd-job-ment.eps written in EPS format) (note: file wf6-review-jobphd-job-art.eps not found) (file wf6-review-jobphd-job-art.eps written in EPS format) (note: file wf6-review-jobphd-job-cit.eps not found) (file wf6-review-jobphd-job-cit.eps written in EPS format) (note: file wf6-review-jobphd-job-fem.eps not found) (file wf6-review-jobphd-job-fem.eps written in EPS format) (note: file wf6-review-jobphd-job-fel.eps not found) (file wf6-review-jobphd-job-fel.eps written in EPS format) (note: file wf6-review-jobphd-phd-ment.eps not found) (file wf6-review-jobphd-phd-ment.eps written in EPS format) (note: file wf6-review-jobphd-phd-art.eps not found) (file wf6-review-jobphd-phd-art.eps written in EPS format) (note: file wf6-review-jobphd-phd-cit.eps not found) (file wf6-review-jobphd-phd-cit.eps written in EPS format) (note: file wf6-review-jobphd-phd-fem.eps not found) (file wf6-review-jobphd-phd-fem.eps written in EPS format) (note: file wf6-review-jobphd-phd-fel.eps not found) (file wf6-review-jobphd-phd-fel.eps written in EPS format) (note: file wf6-review-jobphd-ment-art.eps not found) (file wf6-review-jobphd-ment-art.eps written in EPS format) (note: file wf6-review-jobphd-ment-cit.eps not found) (file wf6-review-jobphd-ment-cit.eps written in EPS format) (note: file wf6-review-jobphd-ment-fem.eps not found) (file wf6-review-jobphd-ment-fem.eps written in EPS format) (note: file wf6-review-jobphd-ment-fel.eps not found) (file 
wf6-review-jobphd-ment-fel.eps written in EPS format) (note: file wf6-review-jobphd-art-cit.eps not found) (file wf6-review-jobphd-art-cit.eps written in EPS format) (note: file wf6-review-jobphd-art-fem.eps not found) (file wf6-review-jobphd-art-fem.eps written in EPS format) (note: file wf6-review-jobphd-art-fel.eps not found) (file wf6-review-jobphd-art-fel.eps written in EPS format) (note: file wf6-review-jobphd-cit-fem.eps not found) (file wf6-review-jobphd-cit-fem.eps written in EPS format) (note: file wf6-review-jobphd-cit-fel.eps not found) (file wf6-review-jobphd-cit-fel.eps written in EPS format) (note: file wf6-review-jobphd-fem-fel.eps not found) (file wf6-review-jobphd-fem-fel.eps written in EPS format) . . log close log: D:\wf\work\wf6-review-jobphd.log log type: text closed on: 24 Oct 2008, 09:41:34 -------------------------------------------------------------------------------- . exit end of do-file . do wf6-review-missing.do . capture log close . log using wf6-review-missing, replace text (note: file D:\wf\work\wf6-review-missing.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf6-review-missing.log log type: text opened on: 24 Oct 2008, 09:41:34 . . // program: wf6-review-missing.do \ for stata 9 . // task: Operations with missing values . // project: workflow chapter 6 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data . . use wf-missing, clear (Workflow data to illustrate missing values review \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . // #2 . // look at the distribution of articles . . tabulate art, missing # of | articles | published | Freq. Percent Cum. 
------------+----------------------------------- 0 | 102 41.98 41.98 1 | 72 29.63 71.60 2 | 25 10.29 81.89 3 | 13 5.35 87.24 4 | 9 3.70 90.95 9 | 1 0.41 91.36 12 | 1 0.41 91.77 15 | 1 0.41 92.18 . | 19 7.82 100.00 ------------+----------------------------------- Total | 243 100.00 . tabulate art # of | articles | published | Freq. Percent Cum. ------------+----------------------------------- 0 | 102 45.54 45.54 1 | 72 32.14 77.68 2 | 25 11.16 88.84 3 | 13 5.80 94.64 4 | 9 4.02 98.66 9 | 1 0.45 99.11 12 | 1 0.45 99.55 15 | 1 0.45 100.00 ------------+----------------------------------- Total | 224 100.00 . summarize art Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art | 224 1.040179 1.694939 0 15 . . // #3 . // recode large numbers to 5 . . generate art_tr5 = art (19 missing values generated) . replace art_tr5 = 5 if art>5 (22 real changes made) . label var art_tr5 "trunc at 5 # of articles published" . summarize art art_tr5 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art | 224 1.040179 1.694939 0 15 art_tr5 | 243 1.263374 1.568207 0 5 . set linesize 100 . tabulate art art_tr5, missing # of | articles | trunc at 5 # of articles published published | 0 1 2 3 4 5 | Total -----------+------------------------------------------------------------------+---------- 0 | 102 0 0 0 0 0 | 102 1 | 0 72 0 0 0 0 | 72 2 | 0 0 25 0 0 0 | 25 3 | 0 0 0 13 0 0 | 13 4 | 0 0 0 0 9 0 | 9 9 | 0 0 0 0 0 1 | 1 12 | 0 0 0 0 0 1 | 1 15 | 0 0 0 0 0 1 | 1 . | 0 0 0 0 0 19 | 19 -----------+------------------------------------------------------------------+---------- Total | 102 72 25 13 9 22 | 243 . . * selecting valid cases with comparisons . generate art_tr5V2 = art (19 missing values generated) . replace art_tr5V2 = 5 if art>5 & art<. (3 real changes made) . label var art_tr5V2 "trunc at 5 # of articles published" . note art_tr5V2: created using art<. . 
tabulate art_tr5V2, missing trunc at 5 | # of | articles | published | Freq. Percent Cum. ------------+----------------------------------- 0 | 102 41.98 41.98 1 | 72 29.63 71.60 2 | 25 10.29 81.89 3 | 13 5.35 87.24 4 | 9 3.70 90.95 5 | 3 1.23 92.18 . | 19 7.82 100.00 ------------+----------------------------------- Total | 243 100.00 . . * selecting valid cases with the missing function . generate art_tr5V3 = art (19 missing values generated) . replace art_tr5V3 = 5 if art>5 & !missing(art) (3 real changes made) . label var art_tr5V3 "trunc at 5 # of articles published" . note art_tr5V3: created using !missing(art) . tabulate art_tr5V3, missing trunc at 5 | # of | articles | published | Freq. Percent Cum. ------------+----------------------------------- 0 | 102 41.98 41.98 1 | 72 29.63 71.60 2 | 25 10.29 81.89 3 | 13 5.35 87.24 4 | 9 3.70 90.95 5 | 3 1.23 92.18 . | 19 7.82 100.00 ------------+----------------------------------- Total | 243 100.00 . . // #4 . // create a missing value indicator . . generate art_ismiss = missing(art) . label var art_ismiss "art is missing?" . label def Lismiss 0 0_valid 1 1_missing . label val art_ismiss Lismiss . tabulate art art_ismiss, missing # of | articles | art is missing? published | 0_valid 1_missing | Total -----------+----------------------+---------- 0 | 102 0 | 102 1 | 72 0 | 72 2 | 25 0 | 25 3 | 13 0 | 13 4 | 9 0 | 9 9 | 1 0 | 1 12 | 1 0 | 1 15 | 1 0 | 1 . | 0 19 | 19 -----------+----------------------+---------- Total | 224 19 | 243 . . // #5 . // listing only missing values . . tab1 phd if missing(phd), miss -> tabulation of phd if missing(phd) PhD | prestige | Freq. Percent Cum. ------------+----------------------------------- a_NonUS | 7 36.84 36.84 b_Unranked | 12 63.16 100.00 ------------+----------------------------------- Total | 19 100.00 . tab1 phd art cit if missing(phd), missing -> tabulation of phd if missing(phd) PhD | prestige | Freq. Percent Cum. 
------------+----------------------------------- a_NonUS | 7 36.84 36.84 b_Unranked | 12 63.16 100.00 ------------+----------------------------------- Total | 19 100.00 -> tabulation of art if missing(phd) # of | articles | published | Freq. Percent Cum. ------------+----------------------------------- . | 19 100.00 100.00 ------------+----------------------------------- Total | 19 100.00 -> tabulation of cit if missing(phd) # of | citations | received | Freq. Percent Cum. ------------+----------------------------------- . | 19 100.00 100.00 ------------+----------------------------------- Total | 19 100.00 . . foreach varname in phd art cit { 2. tab1 `varname' if missing(`varname'), missing 3. } -> tabulation of phd if missing(phd) PhD | prestige | Freq. Percent Cum. ------------+----------------------------------- a_NonUS | 7 36.84 36.84 b_Unranked | 12 63.16 100.00 ------------+----------------------------------- Total | 19 100.00 -> tabulation of art if missing(art) # of | articles | published | Freq. Percent Cum. ------------+----------------------------------- . | 19 100.00 100.00 ------------+----------------------------------- Total | 19 100.00 -> tabulation of cit if missing(cit) # of | citations | received | Freq. Percent Cum. ------------+----------------------------------- . | 19 100.00 100.00 ------------+----------------------------------- Total | 19 100.00 . . log close log: D:\wf\work\wf6-review-missing.log log type: text closed on: 24 Oct 2008, 09:41:34 ---------------------------------------------------------------------------------------------------- . exit end of do-file . do wf6-review-misstype.do . capture log close . log using wf6-review-misstype, replace text (note: file D:\wf\work\wf6-review-misstype.log not found) ---------------------------------------------------------------------------------------------------- log: D:\wf\work\wf6-review-misstype.log log type: text opened on: 24 Oct 2008, 09:41:34 . . 
// program: wf6-review-misstype.do \ for stata 9 . // include: requires wf6-review-misstype-refused.doi . // & wf6-review-misstype-ifsxrel.doi . // task: recoding and checking missing data . // project: workflow chapter 6 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data and define tag . . local tag "wf6-review-misstype.do jsl 2008-10-24." . use wf-misstype, clear (Workflow data to illustrate missing values \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . // #2 . // define missing value codes for V2 variables . . label def missdat /// > .c "c_catskip" /// categorical answer not needed > .d "d_nodebrief" /// declined to be debriefed. > .f "f_femskip" /// not asked since R is female. > .m "m_maleskip" /// not asked since R is male. > .p "p_priorref" /// not asked since R refused prelim question. > .r "r_refused" /// refused to answer question. > .s "s_single" /// not asked since single. > .x "x_nosxrel" /// not asked since no sexual relationships. > .z "z_prior_0" /// not asked since reported 0 on lead-in ques > tion. > . // #3 . // missing values for simple refusals . . clonevar acttvV2 = acttv (1 missing value generated) . tab1 acttvV2, missing -> tabulation of acttvV2 Q1, impt: wtch | TV or movies | Freq. Percent Cum. -----------------+----------------------------------- Not at all impt1 | 12 5.50 5.50 2 | 12 5.50 11.01 3 | 24 11.01 22.02 4 | 26 11.93 33.94 5 | 34 15.60 49.54 6 | 34 15.60 65.14 7 | 30 13.76 78.90 8 | 22 10.09 88.99 9 | 5 2.29 91.28 vry impt10 | 18 8.26 99.54 ref | 1 0.46 100.00 -----------------+----------------------------------- Total | 218 100.00 . tab1 acttvV2, missing nolabel -> tabulation of acttvV2 Q1, impt: | wtch TV or | movies | Freq. Percent Cum. 
------------+----------------------------------- 1 | 12 5.50 5.50 2 | 12 5.50 11.01 3 | 24 11.01 22.02 4 | 26 11.93 33.94 5 | 34 15.60 49.54 6 | 34 15.60 65.14 7 | 30 13.76 78.90 8 | 22 10.09 88.99 9 | 5 2.29 91.28 10 | 18 8.26 99.54 .a | 1 0.46 100.00 ------------+----------------------------------- Total | 218 100.00 . recode acttvV2 .a=.r (acttvV2: 1 changes made) . * or you can use: replace acttvV2 = .r if acttvV2==.a . tabulate acttvV2 acttv, missing Q1, impt: wtch | Q1, impt: wtch TV or movies TV or movies | Not at al 2 3 4 | Total -----------------+--------------------------------------------+---------- Not at all impt1 | 12 0 0 0 | 12 2 | 0 12 0 0 | 12 3 | 0 0 24 0 | 24 4 | 0 0 0 26 | 26 5 | 0 0 0 0 | 34 6 | 0 0 0 0 | 34 7 | 0 0 0 0 | 30 8 | 0 0 0 0 | 22 9 | 0 0 0 0 | 5 vry impt10 | 0 0 0 0 | 18 .r | 0 0 0 0 | 1 -----------------+--------------------------------------------+---------- Total | 12 12 24 26 | 218 Q1, impt: wtch | Q1, impt: wtch TV or movies TV or movies | 5 6 7 8 | Total -----------------+--------------------------------------------+---------- Not at all impt1 | 0 0 0 0 | 12 2 | 0 0 0 0 | 12 3 | 0 0 0 0 | 24 4 | 0 0 0 0 | 26 5 | 34 0 0 0 | 34 6 | 0 34 0 0 | 34 7 | 0 0 30 0 | 30 8 | 0 0 0 22 | 22 9 | 0 0 0 0 | 5 vry impt10 | 0 0 0 0 | 18 .r | 0 0 0 0 | 1 -----------------+--------------------------------------------+---------- Total | 34 34 30 22 | 218 Q1, impt: wtch | Q1, impt: wtch TV or movies TV or movies | 9 vry impt1 ref | Total -----------------+---------------------------------+---------- Not at all impt1 | 0 0 0 | 12 2 | 0 0 0 | 12 3 | 0 0 0 | 24 4 | 0 0 0 | 26 5 | 0 0 0 | 34 6 | 0 0 0 | 34 7 | 0 0 0 | 30 8 | 0 0 0 | 22 9 | 5 0 0 | 5 vry impt10 | 0 18 0 | 18 .r | 0 0 1 | 1 -----------------+---------------------------------+---------- Total | 5 18 1 | 218 . label val acttvV2 missdat . . // #4 . // multiple causes of missing data . . * years married . clonevar maryearV2 = maryear (90 missing values generated) . 
replace maryearV2 = .r if maryear==.a // refused question (3 real changes made, 3 to missing) . replace maryearV2 = .s if married==2 // single (86 real changes made, 86 to missing) . replace maryearV2 = .p if married==.a // married question refused (1 real change made, 1 to missing) . label val maryearV2 missdat . tab1 maryearV2 if !missing(maryearV2) -> tabulation of maryearV2 if !missing(maryearV2) Q22_YRS, | dur cur | marriag yrs | Freq. Percent Cum. ------------+----------------------------------- 0 | 1 0.78 0.78 1 | 3 2.34 3.13 2 | 4 3.13 6.25 3 | 2 1.56 7.81 4 | 2 1.56 9.38 5 | 4 3.13 12.50 6 | 6 4.69 17.19 7 | 6 4.69 21.88 8 | 1 0.78 22.66 9 | 4 3.13 25.78 10 | 2 1.56 27.34 11 | 4 3.13 30.47 12 | 2 1.56 32.03 13 | 8 6.25 38.28 14 | 3 2.34 40.63 15 | 3 2.34 42.97 16 | 2 1.56 44.53 17 | 4 3.13 47.66 18 | 2 1.56 49.22 19 | 3 2.34 51.56 20 | 9 7.03 58.59 21 | 1 0.78 59.38 22 | 2 1.56 60.94 23 | 1 0.78 61.72 24 | 5 3.91 65.63 25 | 1 0.78 66.41 26 | 3 2.34 68.75 28 | 1 0.78 69.53 30 | 1 0.78 70.31 31 | 1 0.78 71.09 32 | 2 1.56 72.66 33 | 2 1.56 74.22 34 | 1 0.78 75.00 35 | 1 0.78 75.78 36 | 4 3.13 78.91 37 | 2 1.56 80.47 38 | 2 1.56 82.03 39 | 1 0.78 82.81 40 | 2 1.56 84.38 41 | 2 1.56 85.94 43 | 1 0.78 86.72 44 | 1 0.78 87.50 45 | 1 0.78 88.28 46 | 2 1.56 89.84 47 | 2 1.56 91.41 48 | 1 0.78 92.19 50 | 3 2.34 94.53 52 | 1 0.78 95.31 53 | 1 0.78 96.09 54 | 2 1.56 97.66 59 | 1 0.78 98.44 60 | 1 0.78 99.22 61 | 1 0.78 100.00 ------------+----------------------------------- Total | 128 100.00 . . * months married . clonevar marmthV2 = marmth (106 missing values generated) . recode marmthV2 .a=.r (marmthV2: 19 changes made) . replace marmthV2 = .s if married==2 // single (86 real changes made, 86 to missing) . replace marmthV2 = .p if married==.a // married question refused (1 real change made, 1 to missing) . label val marmthV2 missdat . tab1 marmthV2 if !missing(marmthV2) -> tabulation of marmthV2 if !missing(marmthV2) Q22_MTHS, | dur cur | marriag mo | Freq. 
Percent Cum. ------------+----------------------------------- 0 | 2 1.79 1.79 1 | 18 16.07 17.86 2 | 11 9.82 27.68 3 | 10 8.93 36.61 4 | 16 14.29 50.89 5 | 13 11.61 62.50 6 | 11 9.82 72.32 7 | 8 7.14 79.46 8 | 6 5.36 84.82 9 | 4 3.57 88.39 10 | 5 4.46 92.86 11 | 8 7.14 100.00 ------------+----------------------------------- Total | 112 100.00 . . // #5 . // missing values that are not missing . . * years plus months married . generate martotal = (maryearV2*12) + marmthV2 (109 missing values generated) . label var martotal "Total months married" . replace martotal = .s if married==2 // single (86 real changes made, 86 to missing) . replace martotal = .p if married==.a // married question refused (1 real change made, 1 to missing) . replace martotal = .r if marmthV2==.r | maryearV2==.r (22 real changes made, 22 to missing) . label val martotal missdat . tab1 martotal if missing(martotal), missing -> tabulation of martotal if missing(martotal) Total | months | married | Freq. Percent Cum. ------------+----------------------------------- p_priorref | 1 0.92 0.92 r_refused | 22 20.18 21.10 s_single | 86 78.90 100.00 ------------+----------------------------------- Total | 109 100.00 . . * check the refusals . list martotal maryearV2 marmthV2 if martotal==.r, clean martotal maryearV2 marmthV2 12. r_refused 53 r_refused 34. r_refused 31 r_refused 37. r_refused r_refused 11 38. r_refused 33 r_refused 40. r_refused r_refused 8 45. r_refused 54 r_refused 59. r_refused 36 r_refused 69. r_refused 2 r_refused 71. r_refused 13 r_refused 112. r_refused 33 r_refused 131. r_refused 15 r_refused 134. r_refused 17 r_refused 140. r_refused 24 r_refused 145. r_refused 26 r_refused 156. r_refused 6 r_refused 166. r_refused 8 r_refused 173. r_refused 46 r_refused 190. r_refused r_refused 4 198. r_refused 26 r_refused 206. r_refused 24 r_refused 210. r_refused 13 r_refused 214. r_refused 28 r_refused . . * years plus months married - corrected . generate martotalV2 = . 
(218 missing values generated) . label var martotalV2 "Total months married" . note martotalV2: marmthV2+(12*maryearV2) if both parts answered; /// > marmthV2 if year is missing; maryearV2 if month is missing \ `tag' . . * replace valid year and month if both are nonmissing . replace martotalV2 = (12*maryearV2) + marmthV2 /// > if !missing(maryearV2) & !missing(marmthV2) (109 real changes made) . . * replace with year if only years is valid . replace martotalV2 = 12*maryearV2 /// > if !missing(maryearV2) & marmthV2==.r (19 real changes made) . . * replace month if only month is valid . replace martotalV2 = marmthV2 if maryearV2==.r & !missing(marmthV2) (3 real changes made) . . * add missing codes for single or prior question refusal . replace martotalV2 = .s if married==2 // single (86 real changes made, 86 to missing) . replace martotalV2 = .p if married==.a // married question refused (1 real change made, 1 to missing) . label val martotalV2 missdat . . * check results . tab1 martotalV2 if missing(martotalV2), miss -> tabulation of martotalV2 if missing(martotalV2) Total | months | married | Freq. Percent Cum. ------------+----------------------------------- p_priorref | 1 1.15 1.15 s_single | 86 98.85 100.00 ------------+----------------------------------- Total | 87 100.00 . tab1 maryearV2 if missing(maryearV2), miss -> tabulation of maryearV2 if missing(maryearV2) Q22_YRS, | dur cur | marriag yrs | Freq. Percent Cum. ------------+----------------------------------- p_priorref | 1 1.11 1.11 r_refused | 3 3.33 4.44 s_single | 86 95.56 100.00 ------------+----------------------------------- Total | 90 100.00 . tab1 marmthV2 if missing(marmthV2), miss -> tabulation of marmthV2 if missing(marmthV2) Q22_MTHS, | dur cur | marriag mo | Freq. Percent Cum. ------------+----------------------------------- p_priorref | 1 0.94 0.94 r_refused | 19 17.92 18.87 s_single | 86 81.13 100.00 ------------+----------------------------------- Total | 106 100.00 . . 
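// Sketch only, not run as part of this log: the three-way construction of
// martotalV2 above could also be confirmed with assert instead of reading the
// tables. It assumes martotalV2, maryearV2, marmthV2, and married are in memory
// exactly as created above; assert prints nothing when a condition holds.
* both parts answered: total is years*12 plus months
assert martotalV2 == (12*maryearV2) + marmthV2 ///
    if !missing(maryearV2) & !missing(marmthV2)
* only years answered
assert martotalV2 == 12*maryearV2 if !missing(maryearV2) & marmthV2==.r
* only months answered
assert martotalV2 == marmthV2 if maryearV2==.r & !missing(marmthV2)
* skip codes carried through from the married question
assert martotalV2 == .s if married==2
assert martotalV2 == .p if married==.a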
* list all cases - to be certain, you can check all cases . * sort martotalV2 . * list martotalV2 maryearV2 marmthV2, clean . . // #6 . // missing data indicator variables . . clonevar acttvV2M = acttvV2 (1 missing value generated) . *replace acttvV2M = 0 if acttvV2>=0 & acttvV2<=9999999 . replace acttvV2M = 0 if !missing(acttvV2) (217 real changes made) . label var acttvV2M "acttvV2 is missing" . label val acttvV2M missdat . tabulate PPGENDER acttvV2M, exact missing | acttvV2 is missing KN gender | 0 r_refused | Total -----------+----------------------+---------- 1Male | 107 0 | 107 2fem | 110 1 | 111 -----------+----------------------+---------- Total | 217 1 | 218 Fisher's exact = 1.000 1-sided Fisher's exact = 0.509 . . // #7 . // use include files to recode missing values . . * acttalk . local varnm acttalk . include wf6-review-misstype-refused.doi . // include: wf6-review-misstype-refused.doi . // used by: wf6-review-misstype.do \ for stata 9 . // task: recode .a to .r if refused . // project: workflow chapter . // author: scott long \ 2008-10-23 . . clonevar `varnm'V2 = `varnm' . recode `varnm'V2 .a=.r (acttalkV2: 0 changes made) . . * actexer . local varnm actexer . include wf6-review-misstype-refused.doi . // include: wf6-review-misstype-refused.doi . // used by: wf6-review-misstype.do \ for stata 9 . // task: recode .a to .r if refused . // project: workflow chapter . // author: scott long \ 2008-10-23 . . clonevar `varnm'V2 = `varnm' (1 missing value generated) . recode `varnm'V2 .a=.r (actexerV2: 1 changes made) . . * acthby . local varnm acthby . include wf6-review-misstype-refused.doi . // include: wf6-review-misstype-refused.doi . // used by: wf6-review-misstype.do \ for stata 9 . // task: recode .a to .r if refused . // project: workflow chapter . // author: scott long \ 2008-10-23 . . clonevar `varnm'V2 = `varnm' (1 missing value generated) . recode `varnm'V2 .a=.r (acthbyV2: 1 changes made) . . * sxrelin . local varnm sxrelin . 
include wf6-review-misstype-refused.doi . // include: wf6-review-misstype-refused.doi . // used by: wf6-review-misstype.do \ for stata 9 . // task: recode .a to .r if refused . // project: workflow chapter . // author: scott long \ 2008-10-23 . . clonevar `varnm'V2 = `varnm' (2 missing values generated) . recode `varnm'V2 .a=.r (sxrelinV2: 2 changes made) . . . * sxrel4w . local varnm sxrel4w . include wf6-review-misstype-refused.doi . // include: wf6-review-misstype-refused.doi . // used by: wf6-review-misstype.do \ for stata 9 . // task: recode .a to .r if refused . // project: workflow chapter . // author: scott long \ 2008-10-23 . . clonevar `varnm'V2 = `varnm' (83 missing values generated) . recode `varnm'V2 .a=.r (sxrel4wV2: 0 changes made) . . include wf6-review-misstype-ifsxrel.doi . // include: wf6-review-misstype-ifsxrel.doi . // used by: wf6-review-misstype.do \ for stata 9 . // task: recode .x if no sex rel; .p if sex rel refused . // project: workflow chapter 6 . // author: scott long \ 2008-10-23 . . replace `varnm'V2 = .x if (sxrelinV2==2) // not in sex relationship (81 real changes made, 81 to missing) . replace `varnm'V2 = .p if (sxrelinV2==.r) // refused sxrelin question (2 real changes made, 2 to missing) . . . log close log: D:\wf\work\wf6-review-misstype.log log type: text closed on: 24 Oct 2008, 09:41:35 -------------------------------------------------------------------------------- . exit end of do-file . do wf6-review-consistent.do . capture log close . log using wf6-review-consistent, replace text (note: file D:\wf\work\wf6-review-consistent.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf6-review-consistent.log log type: text opened on: 24 Oct 2008, 09:41:35 . . // program: wf6-review-consistent.do \ for stata 9 . // task: Check consistency in science data . // project: workflow chapter 6 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . 
set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data . . use wf-acjob, clear (Workflow data on academic biochemists \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . // #2 . // check citations and publications . . * if no articles, there should be no citations . assert cit==0 if art==0 . . * if there are no citations, there might still be articles . * assert art==0 if cit==0 // this would halt execution . assert art==0 if cit==0, rc0 // will not halt execution 21 contradictions in 106 observations assertion is false . . * check citations when articles are 0 . tabulate cit if art==0, miss # of | citations | received | Freq. Percent Cum. ------------+----------------------------------- 0 | 85 100.00 100.00 ------------+----------------------------------- Total | 85 100.00 . . // #3 . // check if job is more prestigious than doctorate . . * how do distributions compare? . compare job phd ---------- difference ---------- count minimum average maximum ------------------------------------------------------------------------ job>phd 72 .01 .3709723 2.08 ---------- jointly defined 408 -3.64 -.9671323 2.08 ---------- total 408 . . * look at difference between phd and job . generate job_phd = job - phd . label var job_phd "job-phd: >0 if better job" . . * crude comparisons . inspect job_phd job_phd: job-phd: >0 if better job Number of Observations ----------------------------------- ------------------------------ Total Integers Nonintegers | # Negative 288 3 285 | # # Zero 48 48 - | # # # Positive 72 - 72 | # # # ----- ----- ----- | # # # # Total 408 51 357 | # # # # . Missing - +---------------------- ----- -3.64 2.08 408 (More than 99 unique values) . . * list large differences . sort job_phd . list job_phd art ment fem cit fel job phd if job_phd>.6, clean job_phd art ment fem cit fel job phd 398. .6000001 0 9 1_Female 0 0_NotFellow 2.72 2.12 399. .6200001 3 6 1_Female 15 1_Fellow 2.88 2.26 400.
.6500001 1 36 1_Female 18 0_NotFellow 3.52 2.87 401. .74 2 6 0_Male 19 1_Fellow 2.49 1.75 402. .8200002 0 20 1_Female 0 0_NotFellow 3.68 2.86 403. .8899999 0 9 0_Male 0 0_NotFellow 3.04 2.15 404. .8900001 0 20 0_Male 0 1_Fellow 4.48 3.59 405. 1.07 4 233 0_Male 22 1_Fellow 2.88 1.81 406. 1.13 0 0 1_Female 0 0_NotFellow 3.52 2.39 407. 1.17 4 69.99999 0_Male 41 1_Fellow 3.68 2.51 408. 2.08 1 3.999999 1_Female 32 1_Fellow 3.36 1.28 . list job_phd art ment fem cit fel job phd if job_phd<-2, clean job_phd art ment fem cit fel job phd 1. -3.64 3 16 0_Male 24 1_Fellow 1 4.64 2. -3.64 0 2 1_Female 0 1_Fellow 1 4.64 3. -3.62 0 87.99999 0_Male 0 1_Fellow 1 4.62 4. -3.54 5 47.00001 0_Male 27 0_NotFellow 1 4.54 5. -3.54 1 23 1_Female 5 0_NotFellow 1 4.54 6. -3.29 0 18 0_Male 0 0_NotFellow 1 4.29 7. -3.29 1 18 0_Male 0 0_NotFellow 1 4.29 8. -3.29 2 32.00001 0_Male 7 1_Fellow 1 4.29 9. -3.29 0 .9999999 0_Male 0 0_NotFellow 1 4.29 10. -3.29 0 12 1_Female 0 0_NotFellow 1 4.29 11. -3.29 2 204 1_Female 9 0_NotFellow 1 4.29 12. -3.29 3 198 1_Female 26 0_NotFellow 1 4.29 13. -3.29 1 18 1_Female 1 0_NotFellow 1 4.29 14. -3.16 3 66 1_Female 80 1_Fellow 1 4.16 15. -3.14 0 7.999998 0_Male 0 0_NotFellow 1 4.14 16. -3 5 30 0_Male 32 1_Fellow 1 4 17. -3 3 3.999999 1_Female 56 1_Fellow 1 4 18. -2.89 1 54.99999 0_Male 6 1_Fellow 1.65 4.54 19. -2.89 1 45.99999 0_Male 2 1_Fellow 1.4 4.29 20. -2.85 2 248 1_Female 1 1_Fellow 1 3.85 21. -2.84 3 72.99999 1_Female 3 1_Fellow 1 3.84 22. -2.84 1 14 0_Male 0 1_Fellow 1 3.84 23. -2.84 1 204 1_Female 5 1_Fellow 1 3.84 24. -2.79 0 9 0_Male 0 0_NotFellow 1.5 4.29 25. -2.76 5 11 0_Male 82 1_Fellow 1.53 4.29 26. -2.75 0 26 1_Female 0 0_NotFellow 1 3.75 27. -2.75 0 13 1_Female 0 0_NotFellow 1 3.75 28. -2.75 0 108 1_Female 0 0_NotFellow 1 3.75 29. -2.75 0 24 1_Female 0 1_Fellow 1 3.75 30. -2.75 0 12 1_Female 0 0_NotFellow 1 3.75 31. -2.71 1 0 0_Male 6 0_NotFellow 1.58 4.29 32. -2.68 2 36 1_Female 0 1_Fellow 1 3.68 33. 
-2.68 1 85.99998 1_Female 13 1_Fellow 1 3.68 34. -2.67 0 3.999999 1_Female 0 0_NotFellow 1.62 4.29 35. -2.67 4 29 1_Female 57 0_NotFellow 1.62 4.29 36. -2.66 1 49.99998 1_Female 1 0_NotFellow 1.63 4.29 37. -2.66 12 144 0_Male 100 0_NotFellow 1.63 4.29 38. -2.64 1 232 1_Female 2 0_NotFellow 1.9 4.54 39. -2.64 2 288 1_Female 11 1_Fellow 2 4.64 40. -2.6 5 2 0_Male 37 1_Fellow 1.94 4.54 41. -2.59 7 2 0_Male 21 0_NotFellow 1 3.59 42. -2.59 2 64.99998 1_Female 24 1_Fellow 1 3.59 43. -2.59 1 .9999999 1_Female 2 1_Fellow 1 3.59 44. -2.59 2 105 0_Male 17 1_Fellow 1 3.59 45. -2.54 0 12 1_Female 0 0_NotFellow 1 3.54 46. -2.52 0 32.00001 1_Female 0 1_Fellow 1 3.52 47. -2.52 3 49.99998 1_Female 18 1_Fellow 1 3.52 48. -2.52 2 303.9999 1_Female 60 1_Fellow 2.12 4.64 49. -2.51 2 51.99999 0_Male 37 0_NotFellow 1.83 4.34 50. -2.48 9 14 0_Male 66 1_Fellow 1.84 4.32 51. -2.45 0 19 1_Female 0 0_NotFellow 1.84 4.29 52. -2.4 4 88.99998 0_Male 43 1_Fellow 1.89 4.29 53. -2.36 1 17 1_Female 1 1_Fellow 1 3.36 54. -2.36 4 .9999999 0_Male 65 1_Fellow 1 3.36 55. -2.36 2 3 1_Female 2 1_Fellow 1 3.36 56. -2.36 3 21 0_Male 19 1_Fellow 1 3.36 57. -2.35 0 4.999998 0_Male 0 0_NotFellow 1.94 4.29 58. -2.33 3 72.99999 1_Female 51 1_Fellow 2.15 4.48 59. -2.22 6 116 0_Male 26 0_NotFellow 2.12 4.34 60. -2.2 1 25 0_Male 2 1_Fellow 1 3.2 61. -2.19 0 0 1_Female 0 0_NotFellow 1 3.19 62. -2.18 3 35.00001 0_Male 5 1_Fellow 2.3 4.48 63. -2.16 6 137 0_Male 75 1_Fellow 1.68 3.84 64. -2.16 2 14 0_Male 6 0_NotFellow 2.13 4.29 65. -2.16 1 241 1_Female 0 1_Fellow 1.84 4 66. -2.15 1 27 0_Male 0 0_NotFellow 1 3.15 67. -2.15 0 17 0_Male 0 0_NotFellow 1 3.15 68. -2.14 0 54.99999 1_Female 0 1_Fellow 2.2 4.34 69. -2.12 2 .9999999 1_Female 4 1_Fellow 1.4 3.52 70. -2.1 0 137 0_Male 0 1_Fellow 2.52 4.62 71. -2.09 0 0 1_Female 0 0_NotFellow 1 3.09 72. -2.05 3 0 0_Male 12 1_Fellow 1.63 3.68 73. -2.05 4 57 0_Male 24 1_Fellow 2.11 4.16 74. -2.04 2 .9999999 0_Male 12 1_Fellow 1 3.04 75. -2.01 1 63 0_Male 5 1_Fellow 2.47 4.48 76. 
-2.01 0 36 1_Female 0 1_Fellow 2.28 4.29 . . * aside: you can round the differences so that fewer decimal . * digits clutter the output . generate job_phdV2 = round(job - phd,.1) . sort job_phdV2 . label var job_phdV2 "job - phd with rounding" . list job_phdV2 `varlist' if job_phd>.5, clean job_ph~2 393. .5 394. .5 395. .6 396. .6 397. .6 398. .6 399. .6 400. .7 401. .7 402. .8 403. .9 404. .9 405. 1.1 406. 1.1 407. 1.2 408. 2.1 . list job_phdV2 `varlist' if job_phd<-2, clean job_ph~2 1. -3.6 2. -3.6 3. -3.6 4. -3.5 5. -3.5 6. -3.3 7. -3.3 8. -3.3 9. -3.3 10. -3.3 11. -3.3 12. -3.3 13. -3.3 14. -3.2 15. -3.1 16. -3 17. -3 18. -2.9 19. -2.9 20. -2.8 21. -2.8 22. -2.8 23. -2.8 24. -2.8 25. -2.8 26. -2.7 27. -2.7 28. -2.7 29. -2.7 30. -2.7 31. -2.7 32. -2.7 33. -2.7 34. -2.7 35. -2.7 36. -2.7 37. -2.7 38. -2.6 39. -2.6 40. -2.6 41. -2.6 42. -2.6 43. -2.6 44. -2.6 45. -2.5 46. -2.5 47. -2.5 48. -2.5 49. -2.5 50. -2.5 51. -2.4 52. -2.4 53. -2.4 54. -2.4 55. -2.4 56. -2.4 57. -2.3 58. -2.3 59. -2.2 60. -2.2 61. -2.2 62. -2.2 63. -2.2 64. -2.2 65. -2.2 66. -2.2 67. -2.2 68. -2.1 69. -2.1 70. -2.1 71. -2.1 72. -2.1 73. -2 74. -2 75. -2 76. -2 . . log close log: D:\wf\work\wf6-review-consistent.log log type: text closed on: 24 Oct 2008, 09:41:35 -------------------------------------------------------------------------------- . exit end of do-file . . * create and verify variables . do wf6-create . capture log close . log using wf6-create, replace text (note: file D:\wf\work\wf6-create.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf6-create.log log type: text opened on: 24 Oct 2008, 09:41:35 . . // program: wf6-create.do \ for stata 9 . // task: Examples of creating new variables . // project: workflow chapter 6 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data and define tag .
. local date "2008-10-24" . local tag "wf6-create.do jsl `date'." . . use wf-lfp, clear (Workflow data on labor force participation \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . // #2 . // changing the meaning of a variable . . * estimate a model with lwg equal to the log of wages . logit lfp k5 k618 age wc hc lwg inc Iteration 0: log likelihood = -514.8732 Iteration 1: log likelihood = -454.32339 Iteration 2: log likelihood = -452.64187 Iteration 3: log likelihood = -452.63296 Iteration 4: log likelihood = -452.63296 Logistic regression Number of obs = 753 LR chi2(7) = 124.48 Prob > chi2 = 0.0000 Log likelihood = -452.63296 Pseudo R2 = 0.1209 ------------------------------------------------------------------------------ lfp | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- k5 | -1.462913 .1970006 -7.43 0.000 -1.849027 -1.076799 k618 | -.0645707 .0680008 -0.95 0.342 -.1978499 .0687085 age | -.0628706 .0127831 -4.92 0.000 -.0879249 -.0378162 wc | .8072738 .2299799 3.51 0.000 .3565215 1.258026 hc | .1117336 .2060397 0.54 0.588 -.2920969 .515564 lwg | .6046931 .1508176 4.01 0.000 .3090961 .9002901 inc | -.0344464 .0082084 -4.20 0.000 -.0505346 -.0183583 _cons | 3.18214 .6443751 4.94 0.000 1.919188 4.445092 ------------------------------------------------------------------------------ . estimates store model_1 . . * estimate a model with lwg equal to wages . replace lwg = exp(lwg) (753 real changes made) . logit lfp k5 k618 age wc hc lwg Iteration 0: log likelihood = -514.8732 Iteration 1: log likelihood = -448.14026 Iteration 2: log likelihood = -439.73525 Iteration 3: log likelihood = -439.02858 Iteration 4: log likelihood = -439.02463 Logistic regression Number of obs = 753 LR chi2(6) = 151.70 Prob > chi2 = 0.0000 Log likelihood = -439.02463 Pseudo R2 = 0.1473 ------------------------------------------------------------------------------ lfp | Coef. Std. Err. 
z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- k5 | -1.457772 .200357 -7.28 0.000 -1.850464 -1.065079 k618 | -.0766757 .0689362 -1.11 0.266 -.2117881 .0584368 age | -.0707129 .0129518 -5.46 0.000 -.0960979 -.0453278 wc | .3846933 .2336943 1.65 0.100 -.0733392 .8427258 hc | -.1570584 .1988687 -0.79 0.430 -.5468338 .232717 lwg | .4035945 .0622637 6.48 0.000 .2815599 .525629 _cons | 2.395029 .6501757 3.68 0.000 1.120708 3.66935 ------------------------------------------------------------------------------ . estimates store model_2 . . * compare models . estimates table _all, stats(N bic) eform b(%9.3f) t(%6.2f) -------------------------------------- Variable | model_1 model_2 -------------+------------------------ k5 | 0.232 0.233 | -7.43 -7.28 k618 | 0.937 0.926 | -0.95 -1.11 age | 0.939 0.932 | -4.92 -5.46 wc | 2.242 1.469 | 3.51 1.65 hc | 1.118 0.855 | 0.54 -0.79 lwg | 1.831 1.497 | 4.01 6.48 inc | 0.966 | -4.20 _cons | 24.098 10.969 | 4.94 3.68 -------------+------------------------ N | 753.000 753.000 bic | 958.258 924.418 -------------------------------------- legend: b/t . . // #3 . // the log of a negative number . . generate inclog = log(inc) (1 missing value generated) . label var inclog "log(inc)" . list inc inclog if inc<0, clean inc inclog 373. -.0290001 . . . // #4 . // documenting new variables . . generate inc_log5 = ln(inc+.5) if !missing(inc) . label var inc_log5 "Log(inc+.5)" . note inc_log5: log(inc+.5) \ `tag' . . // #5 . // the generate command . . use wf-lfp, clear (Workflow data on labor force participation \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . * transform all values of age . generate agesqrt = sqrt(age) . label var agesqrt "Sqrt(age)" . drop agesqrt . . * transform only values of age greater than 5 . generate agesqrt = sqrt(age) if age>5 . label var agesqrt "Sqrt(age) if age>5" . . // #6 . // clonevar . . 
use wf-lfp, clear (Workflow data on labor force participation \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . * create a copy of lfp using generate . generate lfp_gen = lfp . . * create a copy using clonevar . clonevar lfp_clone = lfp . . * comparing the two variables . summarize lfp* Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- lfp | 753 .5683931 .4956295 0 1 lfp_gen | 753 .5683931 .4956295 0 1 lfp_clone | 753 .5683931 .4956295 0 1 . describe lfp* storage display value variable name type format label variable label ------------------------------------------------------------------------------- lfp byte %9.0g lfp In paid labor force? 1=yes 0=no lfp_gen float %9.0g lfp_clone byte %9.0g lfp In paid labor force? 1=yes 0=no . compare lfp lfp_gen ---------- difference ---------- count minimum average maximum ------------------------------------------------------------------------ lfp=lfp_gen 753 ---------- jointly defined 753 0 0 0 ---------- total 753 . compare lfp lfp_clone ---------- difference ---------- count minimum average maximum ------------------------------------------------------------------------ lfp=lfp_clone 753 ---------- jointly defined 753 0 0 0 ---------- total 753 . . // #7 . // replace . . use wf-russia01, clear (Workflow data to illustrate creating variables \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . generate educcat = edyears (159 missing values generated) . label var educcat "Categorized years of education" . replace educcat = 1 if edyears>=0 & edyears<=8 // no HS (278 real changes made) . replace educcat = 2 if edyears>=9 & edyears<=11 // some HS (501 real changes made) . replace educcat = 3 if edyears==12 // HS (205 real changes made) . replace educcat = 4 if edyears>=13 & edyears<=15 // some college (517 real changes made) . replace educcat = 5 if edyears>=16 & edyears<=24 // college plus (135 real changes made) . . 
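// Sketch only, not run as part of this log: after the replace commands above,
// two assert statements could confirm that no raw year count was left
// uncategorized and that the extended missing codes copied from edyears were
// kept. It assumes edyears and educcat are in memory as created above.
* every nonmissing value of edyears should now fall in categories 1 to 5
assert inrange(educcat, 1, 5) if !missing(edyears)
* extended missing codes (.b to .f) should be carried over unchanged
assert educcat == edyears if missing(edyears)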
label def educcat 1 1_NoHS 2 2_someHS 3 3_HS 4 4_someCol 5 5_ColPlus /// > .b b_Refused .c c_DontKnow .d d_AtSchool .e e_AtCollege /// > .f f_NoFrmlSchl . label val educcat educcat . tab1 edyears educcat, missing -> tabulation of edyears Years of | schooling | Freq. Percent Cum. -------------+----------------------------------- 1 | 3 0.17 0.17 2 | 5 0.28 0.44 3 | 21 1.17 1.61 4 | 31 1.72 3.34 5 | 17 0.95 4.28 6 | 16 0.89 5.17 7 | 107 5.95 11.12 8 | 81 4.51 15.63 9 | 34 1.89 17.52 10 | 276 15.35 32.87 11 | 191 10.62 43.49 12 | 205 11.40 54.89 13 | 192 10.68 65.57 14 | 107 5.95 71.52 15 | 218 12.12 83.65 16 | 68 3.78 87.43 17 | 26 1.45 88.88 18 | 21 1.17 90.04 19 | 12 0.67 90.71 20 | 7 0.39 91.10 21 | 1 0.06 91.16 b_Refused | 3 0.17 91.32 c_DontKnow | 61 3.39 94.72 d_AtSchool | 7 0.39 95.11 e_AtCollege | 73 4.06 99.17 f_NoFrmlSchl | 15 0.83 100.00 -------------+----------------------------------- Total | 1,798 100.00 -> tabulation of educcat Categorized | years of | education | Freq. Percent Cum. -------------+----------------------------------- 1_NoHS | 281 15.63 15.63 2_someHS | 501 27.86 43.49 3_HS | 205 11.40 54.89 4_someCol | 517 28.75 83.65 5_ColPlus | 135 7.51 91.16 b_Refused | 3 0.17 91.32 c_DontKnow | 61 3.39 94.72 d_AtSchool | 7 0.39 95.11 e_AtCollege | 73 4.06 99.17 f_NoFrmlSchl | 15 0.83 100.00 -------------+----------------------------------- Total | 1,798 100.00 . . // #8 . // indicator variable problem with missing values . . use wf-russia01, clear (Workflow data to illustrate creating variables \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . tab1 marstat, miss -> tabulation of marstat Marital | status | Freq. Percent Cum. ------------+----------------------------------- 1_married | 931 51.78 51.78 2_widowed | 321 17.85 69.63 3_divorced | 215 11.96 81.59 4_separated | 33 1.84 83.43 5_single | 279 15.52 98.94 .b | 19 1.06 100.00 ------------+----------------------------------- Total | 1,798 100.00 . . * incorrect . 
generate ismar_wrong = (marstat==1) . label var ismar_wrong "Is married created incorrectly" . label def Lyesno 0 0_no 1 1_yes . label val ismar_wrong Lyesno . tabulate marstat ismar_wrong, miss | Is married created Marital | incorrectly status | 0_no 1_yes | Total ------------+----------------------+---------- 1_married | 0 931 | 931 2_widowed | 321 0 | 321 3_divorced | 215 0 | 215 4_separated | 33 0 | 33 5_single | 279 0 | 279 .b | 19 0 | 19 ------------+----------------------+---------- Total | 867 931 | 1,798 . . * correct . generate ismar_right = (marstat==1) if !missing(marstat) (19 missing values generated) . label var ismar_right "Is married?" . label val ismar_right Lyesno . tabulate marstat ismar_right, miss Marital | Is married? status | 0_no 1_yes . | Total ------------+---------------------------------+---------- 1_married | 0 931 0 | 931 2_widowed | 321 0 0 | 321 3_divorced | 215 0 0 | 215 4_separated | 33 0 0 | 33 5_single | 279 0 0 | 279 .b | 0 0 19 | 19 ------------+---------------------------------+---------- Total | 848 931 19 | 1,798 . . * fixing the extended missing value . replace ismar_right = .b if marstat==.b (19 real changes made, 19 to missing) . tabulate marstat ismar_right, miss Marital | Is married? status | 0_no 1_yes .b | Total ------------+---------------------------------+---------- 1_married | 0 931 0 | 931 2_widowed | 321 0 0 | 321 3_divorced | 215 0 0 | 215 4_separated | 33 0 0 | 33 5_single | 279 0 0 | 279 .b | 0 0 19 | 19 ------------+---------------------------------+---------- Total | 848 931 19 | 1,798 . . // #9 . // recode . . * using recode . recode marstat 1=1 2/5=0, gen(ismar2_right) (848 differences between marstat and ismar2_right) . label var ismar2_right "Is married?" . tabulate marstat ismar2_right, miss Marital | Is married? 
status | 0 1 .b | Total ------------+---------------------------------+---------- 1_married | 0 931 0 | 931 2_widowed | 321 0 0 | 321 3_divorced | 215 0 0 | 215 4_separated | 33 0 0 | 33 5_single | 279 0 0 | 279 .b | 0 0 19 | 19 ------------+---------------------------------+---------- Total | 848 931 19 | 1,798 . . * reproduce what was done with replace commands in last example . recode edyears 0/8=1 9/11=2 12=3 13/15=4 16/24=5, gen(educcat2) (1636 differences between edyears and educcat2) . compare educcat educcat2 ---------- difference ---------- count minimum average maximum ------------------------------------------------------------------------ educcat2=educcat2 1639 ---------- jointly defined 1639 0 0 0 jointly missing 159 ---------- total 1798 . tabulate educcat educcat2, miss RECODE of | edyears | (Years of | RECODE of edyears (Years of schooling) schooling) | 1 2 3 4 5 | Total -----------+-------------------------------------------------------+---------- 1 | 281 0 0 0 0 | 281 2 | 0 501 0 0 0 | 501 3 | 0 0 205 0 0 | 205 4 | 0 0 0 517 0 | 517 5 | 0 0 0 0 135 | 135 .b | 0 0 0 0 0 | 3 .c | 0 0 0 0 0 | 61 .d | 0 0 0 0 0 | 7 .e | 0 0 0 0 0 | 73 .f | 0 0 0 0 0 | 15 -----------+-------------------------------------------------------+---------- Total | 281 501 205 517 135 | 1,798 RECODE of | edyears | (Years of | RECODE of edyears (Years of schooling) schooling) | .b .c .d .e .f | Total -----------+-------------------------------------------------------+---------- 1 | 0 0 0 0 0 | 281 2 | 0 0 0 0 0 | 501 3 | 0 0 0 0 0 | 205 4 | 0 0 0 0 0 | 517 5 | 0 0 0 0 0 | 135 .b | 3 0 0 0 0 | 3 .c | 0 61 0 0 0 | 61 .d | 0 0 7 0 0 | 7 .e | 0 0 0 73 0 | 73 .f | 0 0 0 0 15 | 15 -----------+-------------------------------------------------------+---------- Total | 3 61 7 73 15 | 1,798 . . * recode 1 to 0 and change all other values (including missing) to 1 . recode edyears 1=0 *=1, gen(edtest1) (1798 differences between edyears and edtest1) . 
tabulate edyears edtest1, miss | RECODE of edyears Years of | (Years of schooling) schooling | 0 1 | Total -------------+----------------------+---------- 1 | 3 0 | 3 2 | 0 5 | 5 3 | 0 21 | 21 4 | 0 31 | 31 5 | 0 17 | 17 6 | 0 16 | 16 7 | 0 107 | 107 8 | 0 81 | 81 9 | 0 34 | 34 10 | 0 276 | 276 11 | 0 191 | 191 12 | 0 205 | 205 13 | 0 192 | 192 14 | 0 107 | 107 15 | 0 218 | 218 16 | 0 68 | 68 17 | 0 26 | 26 18 | 0 21 | 21 19 | 0 12 | 12 20 | 0 7 | 7 21 | 0 1 | 1 b_Refused | 0 3 | 3 c_DontKnow | 0 61 | 61 d_AtSchool | 0 7 | 7 e_AtCollege | 0 73 | 73 f_NoFrmlSchl | 0 15 | 15 -------------+----------------------+---------- Total | 3 1,795 | 1,798 . . * recode 1 to 0, else 1 except for missing . recode edyears 1=0 *=1 if !missing(edyears), gen(edtest2) (1639 differences between edyears and edtest2) . tabulate edyears edtest2, miss | RECODE of edyears (Years of Years of | schooling) schooling | 0 1 . | Total -------------+---------------------------------+---------- 1 | 3 0 0 | 3 2 | 0 5 0 | 5 3 | 0 21 0 | 21 4 | 0 31 0 | 31 5 | 0 17 0 | 17 6 | 0 16 0 | 16 7 | 0 107 0 | 107 8 | 0 81 0 | 81 9 | 0 34 0 | 34 10 | 0 276 0 | 276 11 | 0 191 0 | 191 12 | 0 205 0 | 205 13 | 0 192 0 | 192 14 | 0 107 0 | 107 15 | 0 218 0 | 218 16 | 0 68 0 | 68 17 | 0 26 0 | 26 18 | 0 21 0 | 21 19 | 0 12 0 | 12 20 | 0 7 0 | 7 21 | 0 1 0 | 1 b_Refused | 0 0 3 | 3 c_DontKnow | 0 0 61 | 61 d_AtSchool | 0 0 7 | 7 e_AtCollege | 0 0 73 | 73 f_NoFrmlSchl | 0 0 15 | 15 -------------+---------------------------------+---------- Total | 3 1,636 159 | 1,798 . . * keep 1, 2, 3, 4, 5 the same; recode 6-24 to 6, except missing . recode edyears 6/24=6 if !missing(edyears), gen(edtest3) (1546 differences between edyears and edtest3) . 
tabulate edyears edtest3, miss Years of | RECODE of edyears (Years of schooling) schooling | 1 2 3 4 | Total -------------+--------------------------------------------+---------- 1 | 3 0 0 0 | 3 2 | 0 5 0 0 | 5 3 | 0 0 21 0 | 21 4 | 0 0 0 31 | 31 5 | 0 0 0 0 | 17 6 | 0 0 0 0 | 16 7 | 0 0 0 0 | 107 8 | 0 0 0 0 | 81 9 | 0 0 0 0 | 34 10 | 0 0 0 0 | 276 11 | 0 0 0 0 | 191 12 | 0 0 0 0 | 205 13 | 0 0 0 0 | 192 14 | 0 0 0 0 | 107 15 | 0 0 0 0 | 218 16 | 0 0 0 0 | 68 17 | 0 0 0 0 | 26 18 | 0 0 0 0 | 21 19 | 0 0 0 0 | 12 20 | 0 0 0 0 | 7 21 | 0 0 0 0 | 1 b_Refused | 0 0 0 0 | 3 c_DontKnow | 0 0 0 0 | 61 d_AtSchool | 0 0 0 0 | 7 e_AtCollege | 0 0 0 0 | 73 f_NoFrmlSchl | 0 0 0 0 | 15 -------------+--------------------------------------------+---------- Total | 3 5 21 31 | 1,798 | RECODE of edyears (Years of Years of | schooling) schooling | 5 6 . | Total -------------+---------------------------------+---------- 1 | 0 0 0 | 3 2 | 0 0 0 | 5 3 | 0 0 0 | 21 4 | 0 0 0 | 31 5 | 17 0 0 | 17 6 | 0 16 0 | 16 7 | 0 107 0 | 107 8 | 0 81 0 | 81 9 | 0 34 0 | 34 10 | 0 276 0 | 276 11 | 0 191 0 | 191 12 | 0 205 0 | 205 13 | 0 192 0 | 192 14 | 0 107 0 | 107 15 | 0 218 0 | 218 16 | 0 68 0 | 68 17 | 0 26 0 | 26 18 | 0 21 0 | 21 19 | 0 12 0 | 12 20 | 0 7 0 | 7 21 | 0 1 0 | 1 b_Refused | 0 0 3 | 3 c_DontKnow | 0 0 61 | 61 d_AtSchool | 0 0 7 | 7 e_AtCollege | 0 0 73 | 73 f_NoFrmlSchl | 0 0 15 | 15 -------------+---------------------------------+---------- Total | 17 1,562 159 | 1,798 . . * recode 1 3 5 7 9 to -1, others unchanged . recode edyears 1 3 5 7 9=-1, gen(edtest4) (182 differences between edyears and edtest4) . 
tabulate edyears edtest4, miss Years of | RECODE of edyears (Years of schooling) schooling | -1 2 4 6 | Total -------------+--------------------------------------------+---------- 1 | 3 0 0 0 | 3 2 | 0 5 0 0 | 5 3 | 21 0 0 0 | 21 4 | 0 0 31 0 | 31 5 | 17 0 0 0 | 17 6 | 0 0 0 16 | 16 7 | 107 0 0 0 | 107 8 | 0 0 0 0 | 81 9 | 34 0 0 0 | 34 10 | 0 0 0 0 | 276 11 | 0 0 0 0 | 191 12 | 0 0 0 0 | 205 13 | 0 0 0 0 | 192 14 | 0 0 0 0 | 107 15 | 0 0 0 0 | 218 16 | 0 0 0 0 | 68 17 | 0 0 0 0 | 26 18 | 0 0 0 0 | 21 19 | 0 0 0 0 | 12 20 | 0 0 0 0 | 7 21 | 0 0 0 0 | 1 b_Refused | 0 0 0 0 | 3 c_DontKnow | 0 0 0 0 | 61 d_AtSchool | 0 0 0 0 | 7 e_AtCollege | 0 0 0 0 | 73 f_NoFrmlSchl | 0 0 0 0 | 15 -------------+--------------------------------------------+---------- Total | 182 5 31 16 | 1,798 Years of | RECODE of edyears (Years of schooling) schooling | 8 10 11 12 | Total -------------+--------------------------------------------+---------- 1 | 0 0 0 0 | 3 2 | 0 0 0 0 | 5 3 | 0 0 0 0 | 21 4 | 0 0 0 0 | 31 5 | 0 0 0 0 | 17 6 | 0 0 0 0 | 16 7 | 0 0 0 0 | 107 8 | 81 0 0 0 | 81 9 | 0 0 0 0 | 34 10 | 0 276 0 0 | 276 11 | 0 0 191 0 | 191 12 | 0 0 0 205 | 205 13 | 0 0 0 0 | 192 14 | 0 0 0 0 | 107 15 | 0 0 0 0 | 218 16 | 0 0 0 0 | 68 17 | 0 0 0 0 | 26 18 | 0 0 0 0 | 21 19 | 0 0 0 0 | 12 20 | 0 0 0 0 | 7 21 | 0 0 0 0 | 1 b_Refused | 0 0 0 0 | 3 c_DontKnow | 0 0 0 0 | 61 d_AtSchool | 0 0 0 0 | 7 e_AtCollege | 0 0 0 0 | 73 f_NoFrmlSchl | 0 0 0 0 | 15 -------------+--------------------------------------------+---------- Total | 81 276 191 205 | 1,798 Years of | RECODE of edyears (Years of schooling) schooling | 13 14 15 16 | Total -------------+--------------------------------------------+---------- 1 | 0 0 0 0 | 3 2 | 0 0 0 0 | 5 3 | 0 0 0 0 | 21 4 | 0 0 0 0 | 31 5 | 0 0 0 0 | 17 6 | 0 0 0 0 | 16 7 | 0 0 0 0 | 107 8 | 0 0 0 0 | 81 9 | 0 0 0 0 | 34 10 | 0 0 0 0 | 276 11 | 0 0 0 0 | 191 12 | 0 0 0 0 | 205 13 | 192 0 0 0 | 192 14 | 0 107 0 0 | 107 15 | 0 0 218 0 | 218 16 | 0 0 0 68 | 68 17 | 0 0 
0 0 | 26 18 | 0 0 0 0 | 21 19 | 0 0 0 0 | 12 20 | 0 0 0 0 | 7 21 | 0 0 0 0 | 1 b_Refused | 0 0 0 0 | 3 c_DontKnow | 0 0 0 0 | 61 d_AtSchool | 0 0 0 0 | 7 e_AtCollege | 0 0 0 0 | 73 f_NoFrmlSchl | 0 0 0 0 | 15 -------------+--------------------------------------------+---------- Total | 192 107 218 68 | 1,798 Years of | RECODE of edyears (Years of schooling) schooling | 17 18 19 20 | Total -------------+--------------------------------------------+---------- 1 | 0 0 0 0 | 3 2 | 0 0 0 0 | 5 3 | 0 0 0 0 | 21 4 | 0 0 0 0 | 31 5 | 0 0 0 0 | 17 6 | 0 0 0 0 | 16 7 | 0 0 0 0 | 107 8 | 0 0 0 0 | 81 9 | 0 0 0 0 | 34 10 | 0 0 0 0 | 276 11 | 0 0 0 0 | 191 12 | 0 0 0 0 | 205 13 | 0 0 0 0 | 192 14 | 0 0 0 0 | 107 15 | 0 0 0 0 | 218 16 | 0 0 0 0 | 68 17 | 26 0 0 0 | 26 18 | 0 21 0 0 | 21 19 | 0 0 12 0 | 12 20 | 0 0 0 7 | 7 21 | 0 0 0 0 | 1 b_Refused | 0 0 0 0 | 3 c_DontKnow | 0 0 0 0 | 61 d_AtSchool | 0 0 0 0 | 7 e_AtCollege | 0 0 0 0 | 73 f_NoFrmlSchl | 0 0 0 0 | 15 -------------+--------------------------------------------+---------- Total | 26 21 12 7 | 1,798 Years of | RECODE of edyears (Years of schooling) schooling | 21 .b .c .d | Total -------------+--------------------------------------------+---------- 1 | 0 0 0 0 | 3 2 | 0 0 0 0 | 5 3 | 0 0 0 0 | 21 4 | 0 0 0 0 | 31 5 | 0 0 0 0 | 17 6 | 0 0 0 0 | 16 7 | 0 0 0 0 | 107 8 | 0 0 0 0 | 81 9 | 0 0 0 0 | 34 10 | 0 0 0 0 | 276 11 | 0 0 0 0 | 191 12 | 0 0 0 0 | 205 13 | 0 0 0 0 | 192 14 | 0 0 0 0 | 107 15 | 0 0 0 0 | 218 16 | 0 0 0 0 | 68 17 | 0 0 0 0 | 26 18 | 0 0 0 0 | 21 19 | 0 0 0 0 | 12 20 | 0 0 0 0 | 7 21 | 1 0 0 0 | 1 b_Refused | 0 3 0 0 | 3 c_DontKnow | 0 0 61 0 | 61 d_AtSchool | 0 0 0 7 | 7 e_AtCollege | 0 0 0 0 | 73 f_NoFrmlSchl | 0 0 0 0 | 15 -------------+--------------------------------------------+---------- Total | 1 3 61 7 | 1,798 | RECODE of edyears Years of | (Years of schooling) schooling | .e .f | Total -------------+----------------------+---------- 1 | 0 0 | 3 2 | 0 0 | 5 3 | 0 0 | 21 4 | 0 0 | 31 5 | 0 0 | 
17 6 | 0 0 | 16 7 | 0 0 | 107 8 | 0 0 | 81 9 | 0 0 | 34 10 | 0 0 | 276 11 | 0 0 | 191 12 | 0 0 | 205 13 | 0 0 | 192 14 | 0 0 | 107 15 | 0 0 | 218 16 | 0 0 | 68 17 | 0 0 | 26 18 | 0 0 | 21 19 | 0 0 | 12 20 | 0 0 | 7 21 | 0 0 | 1 b_Refused | 0 0 | 3 c_DontKnow | 0 0 | 61 d_AtSchool | 0 0 | 7 e_AtCollege | 73 0 | 73 f_NoFrmlSchl | 0 15 | 15 -------------+----------------------+---------- Total | 73 15 | 1,798 . . * recode 6 to max to 6, others unchanged . recode edyears 6/max=6, gen(edtest5) (1546 differences between edyears and edtest5) . tabulate edyears edtest5, miss Years of | RECODE of edyears (Years of schooling) schooling | 1 2 3 4 | Total -------------+--------------------------------------------+---------- 1 | 3 0 0 0 | 3 2 | 0 5 0 0 | 5 3 | 0 0 21 0 | 21 4 | 0 0 0 31 | 31 5 | 0 0 0 0 | 17 6 | 0 0 0 0 | 16 7 | 0 0 0 0 | 107 8 | 0 0 0 0 | 81 9 | 0 0 0 0 | 34 10 | 0 0 0 0 | 276 11 | 0 0 0 0 | 191 12 | 0 0 0 0 | 205 13 | 0 0 0 0 | 192 14 | 0 0 0 0 | 107 15 | 0 0 0 0 | 218 16 | 0 0 0 0 | 68 17 | 0 0 0 0 | 26 18 | 0 0 0 0 | 21 19 | 0 0 0 0 | 12 20 | 0 0 0 0 | 7 21 | 0 0 0 0 | 1 b_Refused | 0 0 0 0 | 3 c_DontKnow | 0 0 0 0 | 61 d_AtSchool | 0 0 0 0 | 7 e_AtCollege | 0 0 0 0 | 73 f_NoFrmlSchl | 0 0 0 0 | 15 -------------+--------------------------------------------+---------- Total | 3 5 21 31 | 1,798 Years of | RECODE of edyears (Years of schooling) schooling | 5 6 .b .c | Total -------------+--------------------------------------------+---------- 1 | 0 0 0 0 | 3 2 | 0 0 0 0 | 5 3 | 0 0 0 0 | 21 4 | 0 0 0 0 | 31 5 | 17 0 0 0 | 17 6 | 0 16 0 0 | 16 7 | 0 107 0 0 | 107 8 | 0 81 0 0 | 81 9 | 0 34 0 0 | 34 10 | 0 276 0 0 | 276 11 | 0 191 0 0 | 191 12 | 0 205 0 0 | 205 13 | 0 192 0 0 | 192 14 | 0 107 0 0 | 107 15 | 0 218 0 0 | 218 16 | 0 68 0 0 | 68 17 | 0 26 0 0 | 26 18 | 0 21 0 0 | 21 19 | 0 12 0 0 | 12 20 | 0 7 0 0 | 7 21 | 0 1 0 0 | 1 b_Refused | 0 0 3 0 | 3 c_DontKnow | 0 0 0 61 | 61 d_AtSchool | 0 0 0 0 | 7 e_AtCollege | 0 0 0 0 | 73 f_NoFrmlSchl | 0 0 0 0 | 15 
-------------+--------------------------------------------+---------- Total | 17 1,562 3 61 | 1,798 | RECODE of edyears (Years of Years of | schooling) schooling | .d .e .f | Total -------------+---------------------------------+---------- 1 | 0 0 0 | 3 2 | 0 0 0 | 5 3 | 0 0 0 | 21 4 | 0 0 0 | 31 5 | 0 0 0 | 17 6 | 0 0 0 | 16 7 | 0 0 0 | 107 8 | 0 0 0 | 81 9 | 0 0 0 | 34 10 | 0 0 0 | 276 11 | 0 0 0 | 191 12 | 0 0 0 | 205 13 | 0 0 0 | 192 14 | 0 0 0 | 107 15 | 0 0 0 | 218 16 | 0 0 0 | 68 17 | 0 0 0 | 26 18 | 0 0 0 | 21 19 | 0 0 0 | 12 20 | 0 0 0 | 7 21 | 0 0 0 | 1 b_Refused | 0 0 0 | 3 c_DontKnow | 0 0 0 | 61 d_AtSchool | 7 0 0 | 7 e_AtCollege | 0 73 0 | 73 f_NoFrmlSchl | 0 0 15 | 15 -------------+---------------------------------+---------- Total | 7 73 15 | 1,798 . . // #10 . // egen to standardize age . . use wf-lfp, clear (Workflow data on labor force participation \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . * standardize using generate and summarize . summarize age Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- age | 753 42.53785 8.072574 30 60 . generate agestd = (age - r(mean)) / r(sd) . label var agestd "Age standardized using generate" . . * use egenerate std . egen agestdV2 = std(age) . label var agestdV2 "Age standardized using egen" . . * compare . compare agestd agestdV2 ---------- difference ---------- count minimum average maximum ------------------------------------------------------------------------ agestd=agestdV2 753 ---------- jointly defined 753 0 0 0 ---------- total 753 . summarize agestd agestdV2 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- agestd | 753 -7.05e-09 1 -1.553141 2.163145 agestdV2 | 753 -7.05e-09 1 -1.553141 2.163145 . regress agestdV2 agestd Source | SS df MS Number of obs = 753 -------------+------------------------------ F( 1, 751) = . Model | 752 1 752 Prob > F = . 
Residual | 0 751 0 R-squared = 1.0000 -------------+------------------------------ Adj R-squared = 1.0000 Total | 752 752 1 Root MSE = 0 ------------------------------------------------------------------------------ agestdV2 | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- agestd | 1 . . . . . _cons | 3.56e-25 . . . . . ------------------------------------------------------------------------------ . . // #11 . // egen anycount . . * anycount(varlist), values(integer numlist): returns the number of . * variables in varlist for which values are equal to any of the integer . * values in a supplied numlist. Values for any observations excluded . * by either [if] or [in] are set to 0 (not missing). . . egen count0 = anycount(lfp k5 k618 age wc hc lwg inc), values(0) . label var count0 "# of 0's in lfp k5 k618 age wc hc lwg inc" . tabulate count0, miss # of 0's in | lfp k5 k618 | age wc hc | lwg inc | Freq. Percent Cum. ------------+----------------------------------- 0 | 11 1.46 1.46 1 | 94 12.48 13.94 2 | 157 20.85 34.79 3 | 251 33.33 68.13 4 | 169 22.44 90.57 5 | 71 9.43 100.00 ------------+----------------------------------- Total | 753 100.00 . . * computing the same thing with a foreach loop . generate count0v2 = 0 . label var count0v2 "v2:# of 0's in lfp k5 k618 age wc hc lwg inc" . foreach var in lfp k5 k618 age wc hc lwg inc { 2. replace count0v2 = count0v2 + 1 if `var'==0 3. } (325 real changes made) (606 real changes made) (258 real changes made) (0 real changes made) (541 real changes made) (458 real changes made) (4 real changes made) (0 real changes made) . . compare count0 count0v2 ---------- difference ---------- count minimum average maximum ------------------------------------------------------------------------ count0=count0v2 753 ---------- jointly defined 753 0 0 0 ---------- total 753 . . // #12 . // tabulate, generate . . 
use wf-russia01, clear (Workflow data to illustrate creating variables \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . tabulate marstat, gen(ms_is) Marital | status | Freq. Percent Cum. ------------+----------------------------------- 1_married | 931 52.33 52.33 2_widowed | 321 18.04 70.38 3_divorced | 215 12.09 82.46 4_separated | 33 1.85 84.32 5_single | 279 15.68 100.00 ------------+----------------------------------- Total | 1,779 100.00 . . codebook ms_is*, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- ms_is1 1779 2 .5233277 0 1 marstat==1_married ms_is2 1779 2 .1804384 0 1 marstat==2_widowed ms_is3 1779 2 .1208544 0 1 marstat==3_divorced ms_is4 1779 2 .0185497 0 1 marstat==4_separated ms_is5 1779 2 .1568297 0 1 marstat==5_single -------------------------------------------------------------------------------- . describe ms_is* storage display value variable name type format label variable label ------------------------------------------------------------------------------- ms_is1 byte %8.0g marstat==1_married ms_is2 byte %8.0g marstat==2_widowed ms_is3 byte %8.0g marstat==3_divorced ms_is4 byte %8.0g marstat==4_separated ms_is5 byte %8.0g marstat==5_single . summarize ms_is* Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- ms_is1 | 1779 .5233277 .499596 0 1 ms_is2 | 1779 .1804384 .3846604 0 1 ms_is3 | 1779 .1208544 .3260497 0 1 ms_is4 | 1779 .0185497 .1349663 0 1 ms_is5 | 1779 .1568297 .3637424 0 1 . tabulate marstat ms_is1, miss Marital | marstat==1_married status | 0 1 . | Total ------------+---------------------------------+---------- 1_married | 0 931 0 | 931 2_widowed | 321 0 0 | 321 3_divorced | 215 0 0 | 215 4_separated | 33 0 0 | 33 5_single | 279 0 0 | 279 .b | 0 0 19 | 19 ------------+---------------------------------+---------- Total | 848 931 19 | 1,798 . . * clean up the variables . 
label def Lyesno 0 0_no 1 1_yes . . rename ms_is1 ms_married . note ms_married: Source var is marstat \ `tag'. . label var ms_married "Married?" . label val ms_married Lyesno . tabulate marstat ms_married, miss Marital | Married? status | 0_no 1_yes . | Total ------------+---------------------------------+---------- 1_married | 0 931 0 | 931 2_widowed | 321 0 0 | 321 3_divorced | 215 0 0 | 215 4_separated | 33 0 0 | 33 5_single | 279 0 0 | 279 .b | 0 0 19 | 19 ------------+---------------------------------+---------- Total | 848 931 19 | 1,798 . . rename ms_is2 ms_widowed . note ms_widowed: Source var is marstat \ `tag'. . label var ms_widowed "Widowed?" . label val ms_widowed Lyesno . tabulate marstat ms_widowed, miss Marital | Widowed? status | 0_no 1_yes . | Total ------------+---------------------------------+---------- 1_married | 931 0 0 | 931 2_widowed | 0 321 0 | 321 3_divorced | 215 0 0 | 215 4_separated | 33 0 0 | 33 5_single | 279 0 0 | 279 .b | 0 0 19 | 19 ------------+---------------------------------+---------- Total | 1,458 321 19 | 1,798 . . rename ms_is3 ms_divorced . note ms_divorced: Source var is marstat \ `tag'. . label var ms_divorced "Divorced?" . label val ms_divorced Lyesno . tabulate marstat ms_divorced, miss Marital | Divorced? status | 0_no 1_yes . | Total ------------+---------------------------------+---------- 1_married | 931 0 0 | 931 2_widowed | 321 0 0 | 321 3_divorced | 0 215 0 | 215 4_separated | 33 0 0 | 33 5_single | 279 0 0 | 279 .b | 0 0 19 | 19 ------------+---------------------------------+---------- Total | 1,564 215 19 | 1,798 . . rename ms_is4 ms_separated . note ms_separated: Source var is marstat \ `tag'. . label var ms_separated "Separated?" . label val ms_separated Lyesno . tabulate marstat ms_separated, miss Marital | Separated? status | 0_no 1_yes . 
| Total ------------+---------------------------------+---------- 1_married | 931 0 0 | 931 2_widowed | 321 0 0 | 321 3_divorced | 215 0 0 | 215 4_separated | 0 33 0 | 33 5_single | 279 0 0 | 279 .b | 0 0 19 | 19 ------------+---------------------------------+---------- Total | 1,746 33 19 | 1,798 . . rename ms_is5 ms_single . note ms_single: Source var is marstat \ `tag'. . label var ms_single "Single?" . label val ms_single Lyesno . tabulate marstat ms_single, miss Marital | Single? status | 0_no 1_yes . | Total ------------+---------------------------------+---------- 1_married | 931 0 0 | 931 2_widowed | 321 0 0 | 321 3_divorced | 215 0 0 | 215 4_separated | 33 0 0 | 33 5_single | 0 279 0 | 279 .b | 0 0 19 | 19 ------------+---------------------------------+---------- Total | 1,500 279 19 | 1,798 . . notes ms_* ms_married: 1. Source var is marstat \ wf6-create.do jsl 2008-10-24.. ms_widowed: 1. Source var is marstat \ wf6-create.do jsl 2008-10-24.. ms_divorced: 1. Source var is marstat \ wf6-create.do jsl 2008-10-24.. ms_separated: 1. Source var is marstat \ wf6-create.do jsl 2008-10-24.. ms_single: 1. Source var is marstat \ wf6-create.do jsl 2008-10-24.. . . * easy way to check for problems with the indicators . regress ms_married ms_widowed ms_divorced ms_separated ms_single Source | SS df MS Number of obs = 1779 -------------+------------------------------ F( 4, 1774) = . Model | 443.7819 4 110.945475 Prob > F = . Residual | 0 1774 0 R-squared = 1.0000 -------------+------------------------------ Adj R-squared = 1.0000 Total | 443.7819 1778 .249596119 Root MSE = 0 ------------------------------------------------------------------------------ ms_married | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- ms_widowed | -1 . . . . . ms_divorced | -1 . . . . . ms_separated | -1 . . . . . ms_single | -1 . . . . . _cons | 1 . . . . . 
------------------------------------------------------------------------------ . . // #13 . // estimate two models and compute predictions . . use wf-lfp, clear (Workflow data on labor force participation \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . * model 1 . logit lfp k5 k618 age wc hc lwg inc Iteration 0: log likelihood = -514.8732 Iteration 1: log likelihood = -454.32339 Iteration 2: log likelihood = -452.64187 Iteration 3: log likelihood = -452.63296 Iteration 4: log likelihood = -452.63296 Logistic regression Number of obs = 753 LR chi2(7) = 124.48 Prob > chi2 = 0.0000 Log likelihood = -452.63296 Pseudo R2 = 0.1209 ------------------------------------------------------------------------------ lfp | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- k5 | -1.462913 .1970006 -7.43 0.000 -1.849027 -1.076799 k618 | -.0645707 .0680008 -0.95 0.342 -.1978499 .0687085 age | -.0628706 .0127831 -4.92 0.000 -.0879249 -.0378162 wc | .8072738 .2299799 3.51 0.000 .3565215 1.258026 hc | .1117336 .2060397 0.54 0.588 -.2920969 .515564 lwg | .6046931 .1508176 4.01 0.000 .3090961 .9002901 inc | -.0344464 .0082084 -4.20 0.000 -.0505346 -.0183583 _cons | 3.18214 .6443751 4.94 0.000 1.919188 4.445092 ------------------------------------------------------------------------------ . predict prm1 (option p assumed; Pr(lfp)) . . * model 2 . logit lfp age wc hc lwg inc Iteration 0: log likelihood = -514.8732 Iteration 1: log likelihood = -486.1315 Iteration 2: log likelihood = -485.87567 Iteration 3: log likelihood = -485.87551 Logistic regression Number of obs = 753 LR chi2(5) = 58.00 Prob > chi2 = 0.0000 Log likelihood = -485.87551 Pseudo R2 = 0.0563 ------------------------------------------------------------------------------ lfp | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- age | -.0169788 .0096886 -1.75 0.080 -.0359682 .0020105 wc | .6524283 .2155619 3.03 0.002 .2299348 1.074922 hc | .0285808 .1954884 0.15 0.884 -.3545694 .4117311 lwg | .6157264 .1452656 4.24 0.000 .331011 .9004418 inc | -.0328025 .0076386 -4.29 0.000 -.0477739 -.0178311 _cons | .8098897 .4510786 1.80 0.073 -.0742082 1.693988 ------------------------------------------------------------------------------ . predict prm2 (option p assumed; Pr(lfp)) . . * check predictions . codebook prm*, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- prm1 753 753 .5683931 .0139875 .9621198 Pr(lfp) prm2 753 753 .5683931 .1012935 .8985487 Pr(lfp) -------------------------------------------------------------------------------- . nmlab prm* prm1 Pr(lfp) prm2 Pr(lfp) . describe prm* storage display value variable name type format label variable label ------------------------------------------------------------------------------- prm1 float %9.0g Pr(lfp) prm2 float %9.0g Pr(lfp) . summarize prm* Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- prm1 | 753 .5683931 .1944213 .0139875 .9621198 prm2 | 753 .5683931 .1347343 .1012935 .8985487 . . // #14 . // estimate and predict with better labels . . use wf-lfp, clear (Workflow data on labor force participation \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . * model 1 . 
logit lfp k5 k618 age wc hc lwg inc Iteration 0: log likelihood = -514.8732 Iteration 1: log likelihood = -454.32339 Iteration 2: log likelihood = -452.64187 Iteration 3: log likelihood = -452.63296 Iteration 4: log likelihood = -452.63296 Logistic regression Number of obs = 753 LR chi2(7) = 124.48 Prob > chi2 = 0.0000 Log likelihood = -452.63296 Pseudo R2 = 0.1209 ------------------------------------------------------------------------------ lfp | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- k5 | -1.462913 .1970006 -7.43 0.000 -1.849027 -1.076799 k618 | -.0645707 .0680008 -0.95 0.342 -.1978499 .0687085 age | -.0628706 .0127831 -4.92 0.000 -.0879249 -.0378162 wc | .8072738 .2299799 3.51 0.000 .3565215 1.258026 hc | .1117336 .2060397 0.54 0.588 -.2920969 .515564 lwg | .6046931 .1508176 4.01 0.000 .3090961 .9002901 inc | -.0344464 .0082084 -4.20 0.000 -.0505346 -.0183583 _cons | 3.18214 .6443751 4.94 0.000 1.919188 4.445092 ------------------------------------------------------------------------------ . predict prm1 (option p assumed; Pr(lfp)) . label var prm1 "Pr(lfp|m1=k5 k618 age wc hc lwg inc)" . note prm1: m1=logit lfp k5 k618 age wc hc lwg inc \ `tag'. . . * model 2 . logit lfp age wc hc lwg inc Iteration 0: log likelihood = -514.8732 Iteration 1: log likelihood = -486.1315 Iteration 2: log likelihood = -485.87567 Iteration 3: log likelihood = -485.87551 Logistic regression Number of obs = 753 LR chi2(5) = 58.00 Prob > chi2 = 0.0000 Log likelihood = -485.87551 Pseudo R2 = 0.0563 ------------------------------------------------------------------------------ lfp | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- age | -.0169788 .0096886 -1.75 0.080 -.0359682 .0020105 wc | .6524283 .2155619 3.03 0.002 .2299348 1.074922 hc | .0285808 .1954884 0.15 0.884 -.3545694 .4117311 lwg | .6157264 .1452656 4.24 0.000 .331011 .9004418 inc | -.0328025 .0076386 -4.29 0.000 -.0477739 -.0178311 _cons | .8098897 .4510786 1.80 0.073 -.0742082 1.693988 ------------------------------------------------------------------------------ . predict prm2 (option p assumed; Pr(lfp)) . label var prm2 "Pr(lfp|m2=age wc hc lwg inc)" . note prm2: m2=logit lfp age wc hc lwg inc \ `tag'. . . * check predictions . codebook prm*, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- prm1 753 753 .5683931 .0139875 .9621198 Pr(lfp|m1=k5 k618 age wc... prm2 753 753 .5683931 .1012935 .8985487 Pr(lfp|m2=age wc hc lwg ... -------------------------------------------------------------------------------- . notes prm* prm1: 1. m1=logit lfp k5 k618 age wc hc lwg inc \ wf6-create.do jsl 2008-10-24.. prm2: 1. m2=logit lfp age wc hc lwg inc \ wf6-create.do jsl 2008-10-24.. . . log close log: D:\wf\work\wf6-create.log log type: text closed on: 24 Oct 2008, 09:41:37 -------------------------------------------------------------------------------- . exit end of do-file . do wf6-verify.do . capture log close . log using wf6-verify, replace text (note: file D:\wf\work\wf6-verify.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf6-verify.log log type: text opened on: 24 Oct 2008, 09:41:37 . . // program: wf6-verify.do \ for stata 9 . // task: Verifying your variables . // project: workflow chapter 6 . // author: scott long \ 2008-10-24 . . // note: uniform() was replaced by runiform() in Stata 10.1 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . 
// #1 . // load data . . use wf-verify, clear (Workflow data to illustrate verifying variables \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . // #2 . // listing values . . *create var that recodes fincome to the midpoint of the range . generate finc_mid = fincome (204 missing values generated) . label var finc_mid "Income coded at the midpoint" . note finc_mid: midpoints for fincome \ wf6-verify.do jsl 2008-10-24. . note finc_mid: high value is 1.25X truncation point \ wf6-verify.do jsl 2008-1 > 0-24. . recode finc_mid /// > 1=1.5 2=4 3=6 4=8 5=9.5 6=10.5 7=11.5 8=12.5 /// > 9=13.5 10=14.5 11=16 12=18.5 13=21 14=23.5 15=23.5 16=32.5 /// > 17=37.5 18=42.5 19=47.5 20=55 21=67.5 22=82.5 23=97.5 24=131.25 (finc_mid: 2283 changes made) . . *create a random variable . set seed 1951 . generate xselect = int( (uniform()*_N)+ 1 ) // renamed to runiform() in stata > 10.1 . label var xselect "Random numbers from 1 to _N" . summarize xselect // verify range Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- xselect | 2487 1244.405 719.3674 1 2487 . . *look at a random selection of observations . sort fincome . list fincome finc_mid if xselect<20, clean fincome finc_mid 76. 2_3-5K 4 173. 3_5-7K 6 283. 4_7-9K 8 345. 5_9-10K 9.5 457. 8_12-13K 12.5 683. 12_17-20K 18.5 767. 13_20-22K 21 895. 14_22-25K 23.5 914. 14_22-25K 23.5 987. 15_25-30K 23.5 1112. 15_25-30K 23.5 1164. 16_30-35K 32.5 1209. 16_30-35K 32.5 1373. 17_35-40K 37.5 1658. 19_45-50K 47.5 1686. 19_45-50K 47.5 1800. 20_50-60K 55 1971. 21_60-75K 67.5 2023. 21_60-75K 67.5 2037. 21_60-75K 67.5 2048. 21_60-75K 67.5 2333. . . 2468. . . . . // #3 . // a continuous variable plot . . generate inc_sqrt = sqrt(inc) if !missing(inc) (1764 missing values generated) . label var inc_sqrt "Square root of inc" . . scatter inc_sqrt inc, msymbol(circle_hollow) . 
graph export wf6-verify-scatter.eps, replace (note: file wf6-verify-scatter.eps not found) (file wf6-verify-scatter.eps written in EPS format) . . // #4 . // recoding midpoints . . use wf-verify, clear (Workflow data to illustrate verifying variables \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . tabulate fincome, miss Income | Freq. Percent Cum. ------------+----------------------------------- 1_<3K | 67 2.69 2.69 2_3-5K | 70 2.81 5.51 3_5-7K | 84 3.38 8.89 4_7-9K | 82 3.30 12.18 5_9-10K | 43 1.73 13.91 6_10-11K | 64 2.57 16.49 7_11-12K | 40 1.61 18.09 8_12-13K | 62 2.49 20.59 9_13-14K | 45 1.81 22.40 10_14-15K | 42 1.69 24.09 11_15-17K | 77 3.10 27.18 12_17-20K | 89 3.58 30.76 13_20-22K | 86 3.46 34.22 14_22-25K | 126 5.07 39.28 15_25-30K | 174 7.00 46.28 16_30-35K | 178 7.16 53.44 17_35-40K | 130 5.23 58.67 18_40-45K | 151 6.07 64.74 19_45-50K | 112 4.50 69.24 20_50-60K | 171 6.88 76.12 21_60-75K | 185 7.44 83.55 22_75-90K | 85 3.42 86.97 23_90-105K | 42 1.69 88.66 24_>105K | 78 3.14 91.80 . | 202 8.12 99.92 .a | 2 0.08 100.00 ------------+----------------------------------- Total | 2,487 100.00 . generate finc_mid = fincome (204 missing values generated) . label var finc_mid "Income coded at the midpoint" . note finc_mid: midpoints for fincome; upper range is 1.25X /// > truncation point \ wf6-verify.do jsl 2008-10-24. . . recode finc_mid /// > 1=1.5 2=4 3=6 4=8 5=9.5 6=10.5 7=11.5 8=12.5 /// > 9=13.5 10=14.5 11=16 12=18.5 13=21 14=23.5 15=23.5 16=32.5 /// > 17=37.5 18=42.5 19=47.5 20=55 21=67.5 22=82.5 23=97.5 24=131.25 (finc_mid: 2283 changes made) . . scatter finc_mid fincome, msymbol(circle_hollow) . graph export wf6-verify-xyscatter.eps, replace (note: file wf6-verify-xyscatter.eps not found) (file wf6-verify-xyscatter.eps written in EPS format) . . scatter fincome finc_mid, msymbol(circle_hollow) . 
graph export wf6-verify-yxscatter.eps, replace (note: file wf6-verify-yxscatter.eps not found) (file wf6-verify-yxscatter.eps written in EPS format) . . // #5 . // checking missing values with tabulate . . generate inc_sqrt = sqrt(inc) (1764 missing values generated) . label var inc_sqrt "Square root family income excluding wife's" . tabulate inc inc_sqrt if missing(inc) | missing(inc_sqrt), miss | Square | root | family Family | income income | excluding excluding | wife's wife's | . | Total -----------+-----------+---------- -.0290001 | 1 | 1 . | 1,742 | 1,742 .a | 5 | 5 .b | 16 | 16 -----------+-----------+---------- Total | 1,764 | 1,764 . . // #6 . // compare two ways of creating the same variable . . * use recode . recode edyears 0/8=1 9/11=2 12=3 13/15=4 16/24=5, gen(educcat) (1636 differences between edyears and educcat) . . * create variable with gen and replace . generate educcatV2 = edyears (848 missing values generated) . replace educcatV2 = 1 if edyears>=0 & edyears<=8 // no HS (278 real changes made) . replace educcatV2 = 2 if edyears>=9 & edyears<=11 // some HS (501 real changes made) . replace educcatV2 = 3 if edyears==12 // HS (205 real changes made) . replace educcatV2 = 4 if edyears>=13 & edyears<=15 // some college (517 real changes made) . replace educcatV2 = 5 if edyears>=16 & edyears<=24 // college plus (135 real changes made) . label var educcatV2 "categorize educ using replace" . . compare educcat educcatV2 ---------- difference ---------- count minimum average maximum ------------------------------------------------------------------------ educcat=educcatV2 1639 ---------- jointly defined 1639 0 0 0 jointly missing 848 ---------- total 2487 . . log close log: D:\wf\work\wf6-verify.log log type: text closed on: 24 Oct 2008, 09:41:39 -------------------------------------------------------------------------------- . exit end of do-file . . * save files . do wf6-save.do . capture log close . 
log using wf6-save, replace text (note: file D:\wf\work\wf6-save.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf6-save.log log type: text opened on: 24 Oct 2008, 09:41:40 . . // program: wf6-save.do \ for stata 9 . // task: Saving datasets . // project: Workflow - Chapter 6 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // dropping variables without variations . . use wf-isspru01, clear (Workflow data from Russian ISSP 2002 \ 2008-04-02) . * in stata 10 and later: datasignature confirm . codebook, problems Potential problems in dataset wf-isspru01.dta potential problem variables -------------------------------------------------------------------------------- constant (or all missing) vars v1 v3 v206 v207 v208 v209 v210 v211 v212 v213 v214 v215 v216 v217 v218 v219 v220 v221 v222 v223 v224 v225 v226 v227 v228 v229 v230 v231 v233 v234 v235 v236 v237 v238 v248 v280 v287 v290 v291 v337 v358 v359 v360 v362 incompletely labeled vars v36 v37 v69 v71 v201 v204 v240 v243 v249 v250 v361 -------------------------------------------------------------------------------- . * place variables without variation in local and drop them . local dropvars = r(cons) . drop `dropvars' . 
describe Contains data from wf-isspru01.dta obs: 1,798 Workflow data from Russian ISSP 2002 \ 2008-04-02 vars: 95 2 Apr 2008 13:29 size: 206,770 (99.8% of memory free) (_dta has notes) ------------------------------------------------------------------------------- storage display value variable name type format label variable label ------------------------------------------------------------------------------- v2 long %10.0g Respondent Number v4 byte %10.0g v4 Workg mom: warm relation child ok v5 byte %10.0g v5 Workg mom: pre school child suffers v6 byte %10.0g v6 Workg woman: family life suffers v7 byte %10.0g v7 What women really want is home & kids v8 byte %10.0g v8 Household satisfies as much as paid job v9 byte %10.0g v9 Work is best for womens independence v10 byte %10.0g v10 Both should contribute to hh income v11 byte %10.0g v11 Mens job is work,womens job household v12 byte %10.0g v12 Men should do larger share of hh work v13 byte %10.0g v13 Men should do larger share of childcare v14 byte %10.0g v14 Shld women work:after marr.before kids v15 byte %10.0g v15 Shld women work:child under school age v16 byte %10.0g v16 Shld women work:youngest kid at school v17 byte %10.0g v17 Shld women work: when kids left home v18 byte %10.0g v18 Marriage: married people gen. happier v19 byte %10.0g v19 Bad marriage better than no marriage v20 byte %10.0g v20 Marriage better,if people want kids v21 byte %10.0g v21 Single parent can raise child as well v22 byte %10.0g v22 Couple livg together without marriage v23 byte %10.0g v23 Couple live together bef. get married v24 byte %10.0g v24 Divorce best solution w marr. 
problems v25 byte %10.0g v25 Children: watching up is greatest joy v26 byte %10.0g v26 People without kids: lead empty lives v27 byte %10.0g v27 Workg women shld:paid maternity leave v28 byte %10.0g v28 Workg parents shld:financial benefits v29 byte %10.0g v29 Organizing income in partnership v30 byte %10.0g v30 Division of hh work: doing the laundry v31 byte %10.0g v31 Division of hh work: small repairs v32 byte %10.0g v32 Div. hh work:care f sick fam.members v33 byte %10.0g v33 Division of hh work:shops f groceries v34 byte %10.0g v34 Division of hh work: hh cleaning v35 byte %10.0g v35 Division of hh work: preparg the meal v36 byte %10.0g v36 How many hours do you spend on hh work v37 byte %10.0g v37 How many hrs spouse,partner works on hh v38 byte %10.0g v38 Sharing of hh work between the partners v39 byte %10.0g v39 How often disagree abt sharg of hh work v40 byte %10.0g v40 Who makes decisions how to raise kids v41 byte %10.0g v41 Final say: choosing weekend activities v42 byte %10.0g v42 Final say: buying major things for home v43 byte %10.0g v43 Who has the higher income? v44 byte %10.0g v44 So many things to do at home v45 byte %10.0g v45 My life at home is rarely stressful v46 byte %10.0g v46 So many things to do at work v47 byte %10.0g v47 My job is rarely stressful v48 byte %10.0g v48 Too tired from work to do duties at home v49 byte %10.0g v49 Difficult to fulfil fam.responsibility v50 byte %10.0g v50 Too tired from hhwork to function i job v51 byte %10.0g v51 Difficult to concentrate at work v52 byte %10.0g v52 Life in general: how happy on the whole v53 byte %10.0g v53 How satisfied with your main job? v54 byte %10.0g v54 How satisfied with your family life? 
v55 byte %10.0g v55 Mother ever workg for pay before R 14 v56 byte %10.0g v56 R worked outside: before R has children v57 byte %10.0g v57 R worked outside: kid under school age v58 byte %10.0g v58 R worked outside:youngest kid at school v59 byte %10.0g v59 R worked outside: after kids left home v60 byte %10.0g v60 Spouse worked outside:before children v61 byte %10.0g v61 Sp. work outside: kid under school age v62 byte %10.0g v62 Sp. work outside: young. kid at school v63 byte %10.0g v63 Sp. work outside: after kids left home v64 byte %10.0g v64 Shld women work outside,when not yt kid v65 byte %10.0g v65 How many people in hh: adults 18 yrs + v66 byte %10.0g v66 How many people in hh:kids 6,7 - 17 yrs v67 byte %10.0g v67 Number of people in hh: kids up to 5,6 v68 byte %10.0g v68 Total number of people in household v69 byte %10.0g v69 Total number of children (RP:<18) R had v70 byte %10.0g v70 Spouse degree: highest qualification v71 byte %10.0g v71 Spouse: hours worked weekly v200 byte %10.0g v200 R: Sex v201 byte %10.0g v201 R: Age v202 byte %10.0g v202 R: Marital status v203 byte %10.0g v203 R: Steady life-partner v204 byte %10.0g v204 R: Education I: years of schooling v205 byte %10.0g v205 R: Education II-highest education level v232 byte %10.0g v232 Country specific education: Russia v239 byte %10.0g v239 R: Current employment status v240 byte %10.0g v240 R: Hours worked weekly v241 int %10.0g v241 R: Occupation ILO,ISCO 1988 4-digit v242 byte %10.0g v242 R: Workg f priv.,pub sector, selfempl. 
v243 int %10.0g v243 R: Self-employed - number of employees v244 byte %10.0g v244 R: Supervises others at work v245 byte %10.0g v245 R: Trade union membership v246 byte %10.0g v246 S-P: Current employment status v247 int %10.0g v247 S-P: Occupation ILO,ISCO 1988 4-digit v249 long %10.0g v249 R: Earnings v250 long %10.0g v250 Family income v251 byte %10.0g v251 How many persons in household v252 byte %10.0g v252 Household composition:children+adults v253 byte %10.0g v253 R: Party affiliation: left-right (der.) v288 int %10.0g v288 R: Religious denomination v289 byte %10.0g v289 R: Religious main groups (derived) v318 byte %10.0g v318 Region: Russia v351 byte %10.0g v351 Size of community: Russia v361 long %10.0g v361 Weighting factor ------------------------------------------------------------------------------- Sorted by: Note: dataset has changed since last saved . codebook, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- v2 1798 1798 1800900 1800001 1801798 Respondent Number v4 1798 7 2.432703 1 9 Workg mom: warm relation ... v5 1798 7 2.512792 1 9 Workg mom: pre school chi... v6 1798 7 2.528365 1 9 Workg woman: family life ... v7 1798 7 2.885984 1 9 What women really want is... v8 1798 7 3.199666 1 9 Household satisfies as mu... v9 1798 7 2.368187 1 9 Work is best for womens i... v10 1798 7 2.090656 1 9 Both should contribute to... v11 1798 7 2.486652 1 9 Mens job is work,womens j... v12 1798 7 2.650167 1 9 Men should do larger shar... v13 1798 7 2.333148 1 9 Men should do larger shar... v14 1798 5 1.532258 1 9 Shld women work:after mar... v15 1798 5 2.456062 1 9 Shld women work:child und... v16 1798 5 2.100667 1 9 Shld women work:youngest ... v17 1798 5 1.598999 1 9 Shld women work: when kid... v18 1798 7 2.655172 1 9 Marriage: married people ... v19 1798 7 3.927697 1 9 Bad marriage better than ... v20 1798 7 2.662959 1 9 Marriage better,if people... 
v21 1798 7 3.043382 1 9 Single parent can raise c... v22 1798 7 2.788098 1 9 Couple livg together with... v23 1798 7 2.712458 1 9 Couple live together bef.... v24 1798 7 2.911012 1 9 Divorce best solution w m... v25 1798 7 1.557286 1 9 Children: watching up is ... v26 1798 7 2.748053 1 9 People without kids: lead... v27 1798 7 1.333704 1 9 Workg women shld:paid mat... v28 1798 7 2.309789 1 9 Workg parents shld:financ... v29 1798 8 1.495551 0 9 Organizing income in part... v30 1798 9 1.615684 0 9 Division of hh work: doin... v31 1798 9 1.767519 0 9 Division of hh work: smal... v32 1798 9 1.811457 0 9 Div. hh work:care f sick ... v33 1798 9 1.691324 0 9 Division of hh work:shops... v34 1798 9 1.725806 0 9 Division of hh work: hh c... v35 1798 9 1.644605 0 9 Division of hh work: prep... v36 1798 58 37.75306 1 99 How many hours do you spe... v37 1798 51 22.61735 0 99 How many hrs spouse,partn... v38 1798 8 1.829255 0 9 Sharing of hh work betwee... v39 1798 8 2.322024 0 9 How often disagree abt sh... v40 1798 8 1.914349 0 9 Who makes decisions how t... v41 1798 8 1.838154 0 9 Final say: choosing weeke... v42 1798 8 1.874861 0 9 Final say: buying major t... v43 1798 10 2.418242 0 9 Who has the higher income? v44 1798 6 3.191324 1 8 So many things to do at home v45 1798 6 2.910456 1 8 My life at home is rarely... v46 1798 8 1.939377 0 9 So many things to do at work v47 1798 8 1.794772 0 9 My job is rarely stressful v48 1798 6 1.468854 0 9 Too tired from work to do... v49 1798 6 1.823137 0 9 Difficult to fulfil fam.r... v50 1798 6 2.27475 0 9 Too tired from hhwork to ... v51 1798 6 2.299221 0 9 Difficult to concentrate ... v52 1798 9 3.368187 1 9 Life in general: how happ... v53 1798 10 1.912125 0 9 How satisfied with your m... v54 1798 9 3.594549 1 9 How satisfied with your f... v55 1798 4 1.378198 1 9 Mother ever workg for pay... v56 1798 6 1.259177 0 9 R worked outside: before ... v57 1798 6 1.590656 0 9 R worked outside: kid und... 
v58 1798 6 2.035595 0 9 R worked outside:youngest... v59 1798 6 3.344271 0 9 R worked outside: after ... v60 1798 6 1.333148 0 9 Spouse worked outside:bef... v61 1798 6 1.73693 0 9 Sp. work outside: kid und... v62 1798 6 2.219689 0 9 Sp. work outside: young. ... v63 1798 6 3.567853 0 9 Sp. work outside: after k... v64 1798 5 1.897664 1 9 Shld women work outside,w... v65 1798 9 1.964405 0 99 How many people in hh: ad... v66 1798 6 .4877642 0 99 How many people in hh:kid... v67 1798 5 .21802 0 99 Number of people in hh: k... v68 1798 11 2.756952 1 99 Total number of people in... v69 1798 13 6.767519 0 99 Total number of children ... v70 1798 8 5.44327 0 9 Spouse degree: highest qu... v71 1798 49 17.69633 0 99 Spouse: hours worked weekly v200 1798 2 1.613459 1 2 R: Sex v201 1798 71 46.87875 18 91 R: Age v202 1798 6 2.177976 1 9 R: Marital status v203 1798 4 .9276974 0 9 R: Steady life-partner v204 1798 26 19.11513 1 99 R: Education I: years of ... v205 1798 7 3.449388 0 9 R: Education II-highest e... v232 1798 8 5.117353 1 99 Country specific educatio... v239 1798 11 5.520578 1 99 R: Current employment status v240 1798 59 25.34705 0 99 R: Hours worked weekly v241 1798 226 3316.383 0 9999 R: Occupation ILO,ISCO 19... v242 1798 7 1.268076 0 9 R: Workg f priv.,pub sect... v243 1798 19 1138.565 0 9999 R: Self-employed - number... v244 1798 6 1.122358 0 9 R: Supervises others at work v245 1798 5 2.237486 1 9 R: Trade union membership v246 1798 14 2.909344 0 99 S-P: Current employment s... v247 1798 171 2478.075 0 9999 S-P: Occupation ILO,ISCO ... v249 1798 193 228793 25 999999 R: Earnings v250 1798 187 232410.9 80 999999 Family income v251 1798 11 2.756952 1 99 How many persons in house... v252 1798 17 5.869299 1 99 Household composition:chi... v253 1798 9 5.408788 1 9 R: Party affiliation: lef... v288 1798 10 494.8943 100 999 R: Religious denomination v289 1798 10 5.517241 1 99 R: Religious main groups ... 
v318 1798 11 6.715795 1 11 Region: Russia v351 1798 8 4.690768 1 8 Size of community: Russia v361 1798 240 99994.71 13130 788310 Weighting factor -------------------------------------------------------------------------------- . . // #2 . // keeping variables needed in analysis . . use wf-isspru01, clear (Workflow data from Russian ISSP 2002 \ 2008-04-02) . * in stata 10 and later: datasignature confirm . keep v2 v4 v5 v6 v7 v8 v9 v200 v201 v202 v204 v232 v239 v249 . describe Contains data from wf-isspru01.dta obs: 1,798 Workflow data from Russian ISSP 2002 \ 2008-04-02 vars: 14 2 Apr 2008 13:29 size: 43,152 (99.9% of memory free) (_dta has notes) ------------------------------------------------------------------------------- storage display value variable name type format label variable label ------------------------------------------------------------------------------- v2 long %10.0g Respondent Number v4 byte %10.0g v4 Workg mom: warm relation child ok v5 byte %10.0g v5 Workg mom: pre school child suffers v6 byte %10.0g v6 Workg woman: family life suffers v7 byte %10.0g v7 What women really want is home & kids v8 byte %10.0g v8 Household satisfies as much as paid job v9 byte %10.0g v9 Work is best for womens independence v200 byte %10.0g v200 R: Sex v201 byte %10.0g v201 R: Age v202 byte %10.0g v202 R: Marital status v204 byte %10.0g v204 R: Education I: years of schooling v232 byte %10.0g v232 Country specific education: Russia v239 byte %10.0g v239 R: Current employment status v249 long %10.0g v249 R: Earnings ------------------------------------------------------------------------------- Sorted by: Note: dataset has changed since last saved . codebook, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- v2 1798 1798 1800900 1800001 1801798 Respondent Number v4 1798 7 2.432703 1 9 Workg mom: warm relation ... v5 1798 7 2.512792 1 9 Workg mom: pre school chi... 
v6 1798 7 2.528365 1 9 Workg woman: family life ... v7 1798 7 2.885984 1 9 What women really want is... v8 1798 7 3.199666 1 9 Household satisfies as mu... v9 1798 7 2.368187 1 9 Work is best for womens i... v200 1798 2 1.613459 1 2 R: Sex v201 1798 71 46.87875 18 91 R: Age v202 1798 6 2.177976 1 9 R: Marital status v204 1798 26 19.11513 1 99 R: Education I: years of ... v232 1798 8 5.117353 1 99 Country specific educatio... v239 1798 11 5.520578 1 99 R: Current employment status v249 1798 193 228793 25 999999 R: Earnings -------------------------------------------------------------------------------- . . // #3 . // adding metadata to a dataset . . label data "Workflow ISSP 2002 Russian data \ 2008-10-24" . note: wf-isspru02.dta \ workflow ch 6 - can delete file \ wf6-save.do jsl 2008 > -10-24. . * reset signature since one is already stored with the data . * in stata 10 and later: datasignature set, reset . save wf-isspru02, replace file wf-isspru02.dta saved . . * check the metadata . use wf-isspru02, clear (Workflow ISSP 2002 Russian data \ 2008-10-24) . * in stata 10 and later: datasignature confirm . notes _dta _dta: 1. wf-isspru01.dta \ wf-issp-stataold.dta \ wf-isspru01-support.do jsl 2008-04-02. 2. wf-isspru02.dta \ workflow ch 6 - can delete file \ wf6-save.do jsl 2008-10-24. . . // #4 . // check problems with codebook . . * load the data . use wf-diagnostics, clear (Workflow data to illustrate data diagnostics \ 2008-04-05) . * in stata 10 and later: datasignature confirm . . * check for problems . codebook, problems Potential problems in dataset wf-diagnostics.dta potential problem variables ------------------------------------------------------------------------ constant (or all missing) vars v3 v256 v265 v274 v283 v294 v303 v312 vars with nonexisting label v7 incompletely labeled vars v36 v37 ------------------------------------------------------------------------ . . * check variable without variation . 
tab1 v3, miss -> tabulation of v3 Country | Freq. Percent Cum. ------------+----------------------------------- RUS | 100 100.00 100.00 ------------+----------------------------------- Total | 100 100.00 . tab1 v256, miss -> tabulation of v256 R: Party affiliation: | Bulgaria | Freq. Percent Cum. -----------------------+----------------------------------- NAV | 100 100.00 100.00 -----------------------+----------------------------------- Total | 100 100.00 . . * check missing labels . describe v7 storage display value variable name type format label variable label ------------------------------------------------------------------------------- v7 byte %10.0g labv7 What women really want is home & kids . tab1 v7, miss -> tabulation of v7 What women | really want | is home & | kids | Freq. Percent Cum. ------------+----------------------------------- 1 | 21 21.00 21.00 2 | 33 33.00 54.00 3 | 27 27.00 81.00 4 | 14 14.00 95.00 5 | 2 2.00 97.00 8 | 3 3.00 100.00 ------------+----------------------------------- Total | 100 100.00 . . * check incomplete labels . tab1 v37, miss -> tabulation of v37 How many hrs spouse,partner | works on hh | Freq. Percent Cum. -----------------------------+----------------------------------- NAP,no partner | 50 50.00 50.00 1 hour or less than 1 hr | 1 1.00 51.00 2 hrs | 1 1.00 52.00 | 2 2.00 54.00 7 | 6 6.00 60.00 8 | 1 1.00 61.00 9 | 1 1.00 62.00 | 4 4.00 66.00 14 | 3 3.00 69.00 15 | 4 4.00 73.00 | 5 5.00 78.00 24 | 1 1.00 79.00 29 | 1 1.00 80.00 | 5 5.00 85.00 35 | 1 1.00 86.00 40 | 2 2.00 88.00 45 | 1 1.00 89.00 49 | 1 1.00 90.00 63 | 1 1.00 91.00 70 | 1 1.00 92.00 None, no hour | 1 1.00 93.00 Dont know,cant say | 6 6.00 99.00 Na | 1 1.00 100.00 -----------------------------+----------------------------------- Total | 100 100.00 . tab1 v37, miss nol -> tabulation of v37 How many | hrs | spouse,part | ner works | on hh | Freq. Percent Cum. 
------------+----------------------------------- 0 | 50 50.00 50.00 1 | 1 1.00 51.00 2 | 1 1.00 52.00 3 | 2 2.00 54.00 7 | 6 6.00 60.00 8 | 1 1.00 61.00 9 | 1 1.00 62.00 10 | 4 4.00 66.00 14 | 3 3.00 69.00 15 | 4 4.00 73.00 20 | 5 5.00 78.00 24 | 1 1.00 79.00 29 | 1 1.00 80.00 30 | 5 5.00 85.00 35 | 1 1.00 86.00 40 | 2 2.00 88.00 45 | 1 1.00 89.00 49 | 1 1.00 90.00 63 | 1 1.00 91.00 70 | 1 1.00 92.00 96 | 1 1.00 93.00 98 | 6 6.00 99.00 99 | 1 1.00 100.00 ------------+----------------------------------- Total | 100 100.00 . . // #5 . // check for duplicates . . * isid is commented out since it generates an error that stops the program . * use wf-diagnostics, clear . * isid id . . * check duplicates with duplicates command . duplicates report id Duplicates in terms of id -------------------------------------- copies | observations surplus ----------+--------------------------- 1 | 98 0 2 | 2 1 -------------------------------------- . duplicates examples id, clean Duplicates in terms of id # e.g. obs id 2 1 1800007 . . log close log: D:\wf\work\wf6-save.log log type: text closed on: 24 Oct 2008, 09:41:42 -------------------------------------------------------------------------------- . exit end of do-file . . * example . do wf6-create01-controls.do, nostop // so cf doesn't end do file . capture log close . log using wf6-create01-controls, replace text (note: file D:\wf\work\wf6-create01-controls.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf6-create01-controls.log log type: text opened on: 24 Oct 2008, 09:41:42 . . // program: wf6-create01-controls.do \ for stata 9 - step 1 of 3 . // task: Create control variables for ISSP Russian data . // project: workflow chapter 6 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // define local . . local date "2008-10-24" . 
local tag "wf6-create01.do jsl `date'." . . // #2 . // load data . . use wf-russia01, clear (Workflow data to illustrate creating variables \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . // #3 . // create controls for demographic variables . . gen female = gender - 1 . label var female "Female?" . label def female 0 0_male 1 1_female . label val female female . note female: based on gender \ `tag' . tab gender female, miss Gender: | 1=male, | Female? 2=female | 0_male 1_female | Total ---------------+----------------------+---------- 1. Male | 695 0 | 695 2. Female | 0 1,103 | 1,103 ---------------+----------------------+---------- Total | 695 1,103 | 1,798 . . gen male = 1 - female . label var male "Male?" . label def male 1 1_male 0 0_female . label val male male . note male: based on gender \ `tag' . tab gender male, miss Gender: | 1=male, | Male? 2=female | 0_female 1_male | Total ---------------+----------------------+---------- 1. Male | 0 695 | 695 2. Female | 1,103 0 | 1,103 ---------------+----------------------+---------- Total | 1,103 695 | 1,798 . . recode marstat (1 2 3 4=1) (5=0), gen(married) (848 differences between marstat and married) . label def married 1 1_married 0 0_never . label val married married . label var married "Ever married?" . note married: recoding of marstat \ married includes married, /// > widowed, divorced, separated \ `tag' . . tab marstat married, miss Marital | Ever married? status | 0_never 1_married .b | Total ------------+---------------------------------+---------- 1_married | 0 931 0 | 931 2_widowed | 0 321 0 | 321 3_divorced | 0 215 0 | 215 4_separated | 0 33 0 | 33 5_single | 279 0 0 | 279 .b | 0 0 19 | 19 ------------+---------------------------------+---------- Total | 279 1,500 19 | 1,798 . . recode edlevel (1 2 3 4 5=0) (6 7=1) (99=.n), gen(hidegree) (1795 differences between edlevel and hidegree) . label var hidegree "Any higher education?" . label def hidegree 0 0_not 1 1_high_ed . 
label val hidegree hidegree . note hidegree: recode of edlevel \ `tag' . tab edlevel hidegree, miss | Any higher education? Education level | 0_not 1_high_ed .b | Total ----------------------+---------------------------------+---------- 1. None,still at scho | 90 0 0 | 90 2. Incomplete primary | 8 0 0 | 8 3. Primary completed | 102 0 0 | 102 4. Incomplete seconda | 236 0 0 | 236 5. Secondary complete | 955 0 0 | 955 6. Semi-higher,incomp | 0 55 0 | 55 7. University complet | 0 349 0 | 349 b_Refused | 0 0 3 | 3 ----------------------+---------------------------------+---------- Total | 1,391 404 3 | 1,798 . . recode empstat (1 7=1) (2 3 5 6 8 9 10=0) (98=.d) (99=.n), gen(fulltime) (910 differences between empstat and fulltime) . label def fulltime 1 1_fulltime 0 0_not . label val fulltime fulltime . label var fulltime "Ever worked full time?" . note fulltime: recoding of empstat; includes fulltime & retired \ `tag' . tab empstat fulltime, miss Current employment | Ever worked full time? status | 0_not 1_fulltim .b .c | Total ----------------------+--------------------------------------------+---------- 1. Employed-full time | 0 855 0 0 | 855 2. Employed-part time | 87 0 0 0 | 87 3. Empl-< part-time | 35 0 0 0 | 35 5. Unemployed | 69 0 0 0 | 69 6. Studt,school,vocat | 59 0 0 0 | 59 7. Retired | 0 531 0 0 | 531 8. Housewife,home dut | 72 0 0 0 | 72 9. Permanently disabl | 34 0 0 0 | 34 10. Oth,not i labour | 23 0 0 0 | 23 b_Refused | 0 0 30 0 | 30 c_DontKnow | 0 0 0 3 | 3 ----------------------+--------------------------------------------+---------- Total | 379 1,386 30 3 | 1,798 . . // #4 . // check new variables . . codebook female-fulltime, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- female 1798 2 .6134594 0 1 Female? male 1798 2 .3865406 0 1 Male? married 1779 2 .8431703 0 1 Ever married? hidegree 1795 2 .2250696 0 1 Any higher education? 
fulltime 1765 2 .7852691 0 1 Ever worked full time? -------------------------------------------------------------------------------- . . // #5 . // cleanup and save . . sort id . quietly compress . label data "Workflow example of adding analysis variables \ `date'" . note: wf-russia02.dta \ `tag' . * in stata 10 and later: datasignature set, reset . save wf-russia02, replace file wf-russia02.dta saved . . * verify data that was saved . use wf-russia02, clear (Workflow example of adding analysis variables \ 2008-10-24) . * in stata 10 and later: datasignature confirm . notes _dta: 1. wf-russia01.dta \ wf-isspru01.dta \ wf-russia01-support.do jsl 2008-04-02. 2. wf-russia02.dta \ wf6-create01.do jsl 2008-10-24. id: 1. clone of v2 wf-russia01-support.do jsl 2008-04-02. momwarm: 1. clone of v4 wf-russia01-support.do jsl 2008-04-02. 2. low values == pro working women. kidsuffer: 1. clone of v5 wf-russia01-support.do jsl 2008-04-02. 2. high values == pro working women. famsuffer: 1. clone of v6 wf-russia01-support.do jsl 2008-04-02. 2. high values == pro working women. wanthome: 1. clone of v7 wf-russia01-support.do jsl 2008-04-02. 2. high values == pro working women. housesat: 1. clone of v8 wf-russia01-support.do jsl 2008-04-02. 2. high values == pro working women. workbest: 1. clone of v9 wf-russia01-support.do jsl 2008-04-02. 2. low values == pro working women. gender: 1. clone of v200 wf-russia01-support.do jsl 2008-04-02. age: 1. clone of v201 wf-russia01-support.do jsl 2008-04-02. marstat: 1. clone of v202 wf-russia01-support.do jsl 2008-04-02. edyears: 1. copy of v204 wf-russia01-support.do jsl 2008-04-02. edlevel: 1. clone of v232 wf-russia01-support.do jsl 2008-04-02. empstat: 1. clone of v239 wf-russia01-support.do jsl 2008-04-02. earnings: 1. clone of v249 wf-russia01-support.do jsl 2008-04-02. female: 1. based on gender \ wf6-create01.do jsl 2008-10-24. male: 1. based on gender \ wf6-create01.do jsl 2008-10-24. married: 1. 
recoding of marstat \ married includes married, widowed, divorced, separated \ wf6-create01.do jsl 2008-10-24. hidegree: 1. recode of edlevel \ wf6-create01.do jsl 2008-10-24. fulltime: 1. recoding of empstat; includes fulltime & retired \ wf6-create01.do jsl 2008-10-24. . codebook, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- id 1798 1798 1800900 1800001 1801798 Respondent number momwarm 1765 5 2.324646 1 5 Working mom can have warm... kidsuffer 1755 5 2.373789 1 5 Pre-school child suffers? famsuffer 1759 5 2.401933 1 5 Family life suffers? wanthome 1717 5 2.639487 1 5 Women really want is home... housesat 1680 5 2.855357 1 5 Housework satisfies like ... workbest 1710 5 2.071345 1 5 Work best for women's ind... gender 1798 2 1.613459 1 2 Gender: 1=male, 2=female age 1798 71 46.87875 18 91 Age in years marstat 1779 5 2.105115 1 5 Marital status edyears 1639 21 11.57169 1 21 Years of schooling edlevel 1795 7 4.960446 1 7 Education level empstat 1765 9 3.774504 1 10 Current employment status earnings 1390 190 2424.938 25 90000 Earnings female 1798 2 .6134594 0 1 Female? male 1798 2 .3865406 0 1 Male? married 1779 2 .8431703 0 1 Ever married? hidegree 1795 2 .2250696 0 1 Any higher education? fulltime 1765 2 .7852691 0 1 Ever worked full time? -------------------------------------------------------------------------------- . cf _all using wf-russia01 id: 1795 mismatches momwarm: 1344 mismatches kidsuffer: 1294 mismatches famsuffer: 1333 mismatches wanthome: 1390 mismatches housesat: 1408 mismatches workbest: 1269 mismatches gender: 844 mismatches age: 1769 mismatches marstat: 1172 mismatches edyears: 1648 mismatches edlevel: 1201 mismatches empstat: 1229 mismatches earnings: 1731 mismatches female: does not exist in using male: does not exist in using married: does not exist in using hidegree: does not exist in using fulltime: does not exist in using r(9); . . 
log close log: D:\wf\work\wf6-create01-controls.log log type: text closed on: 24 Oct 2008, 09:41:42 -------------------------------------------------------------------------------- . exit end of do-file . do wf6-create02-binary.do, nostop // so cf doesn't end do file . capture log close . log using wf6-create02-binary, replace text (note: file D:\wf\work\wf6-create02-binary.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf6-create02-binary.log log type: text opened on: 24 Oct 2008, 09:41:42 . . // program: wf6-create02-binary.do \ for stata 9 - step 2 of 3 . // task: Create variables for ISSP data . // project: workflow chapter 6 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // define local . . local date "2008-10-24" . local tag "wf6-create02.do jsl `date'." . . // #2 . // load data . . use wf-russia02, clear (Workflow example of adding analysis variables \ 2008-10-24) . * in stata 10 and later: datasignature confirm . . // #3 . // create binary indicators . . codebook momwarm kidsuffer famsuffer wanthome housesat workbest, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- momwarm 1765 5 2.324646 1 5 Working mom can have warm relatio... kidsuffer 1755 5 2.373789 1 5 Pre-school child suffers? famsuffer 1759 5 2.401933 1 5 Family life suffers? wanthome 1717 5 2.639487 1 5 Women really want is home & kids? housesat 1680 5 2.855357 1 5 Housework satisfies like paid job? workbest 1710 5 2.071345 1 5 Work best for women's independence? -------------------------------------------------------------------------------- . nmlab momwarm kidsuffer famsuffer wanthome housesat workbest momwarm Working mom can have warm relations w kids? kidsuffer Pre-school child suffers? famsuffer Family life suffers? 
wanthome Women really want is home & kids? housesat Housework satisfies like paid job? workbest Work best for women's independence? . sum momwarm kidsuffer famsuffer wanthome housesat workbest Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- momwarm | 1765 2.324646 1.147554 1 5 kidsuffer | 1755 2.373789 1.050274 1 5 famsuffer | 1759 2.401933 1.098358 1 5 wanthome | 1717 2.639487 1.111983 1 5 housesat | 1680 2.855357 1.101694 1 5 -------------+-------------------------------------------------------- workbest | 1710 2.071345 .951515 1 5 . tab1 momwarm, miss -> tabulation of momwarm Working mom | can have warm | relations w | kids? | Freq. Percent Cum. ---------------+----------------------------------- 1StAgree | 464 25.81 25.81 2Agree | 712 39.60 65.41 3Neither | 197 10.96 76.36 4Disagree | 336 18.69 95.05 5StDisagree | 56 3.11 98.16 a_Can't choose | 26 1.45 99.61 b_Refused | 7 0.39 100.00 ---------------+----------------------------------- Total | 1,798 100.00 . . * check direction of coding . pwcorr momwarm kidsuffer famsuffer wanthome housesat workbest, obs | momwarm kidsuf~r famsuf~r wanthome housesat workbest -------------+------------------------------------------------------ momwarm | 1.0000 | 1765 | kidsuffer | -0.2494 1.0000 | 1736 1755 | famsuffer | -0.2517 0.5767 1.0000 | 1737 1738 1759 | wanthome | -0.1069 0.2357 0.2977 1.0000 | 1698 1688 1698 1717 | housesat | -0.0148 0.1465 0.1921 0.4133 1.0000 | 1664 1657 1662 1649 1680 | workbest | 0.0624 -0.0220 -0.0717 -0.1369 -0.2019 1.0000 | 1691 1684 1690 1659 1636 1710 | . . * new value labels . label def Lagree 1 1_agree 0 0_not .a a_Unsure /// > .b b_Refused .n n_Neutral . label def Lprowork 1 1_yesPos 0 0_noNeg .a a_Unsure /// > .b b_Refused .n n_Neutral . . * momwarm: 1=SA working mom can have warm relationship . * Bwarm: 1=agree (not reversed) . recode momwarm (1/2=1) (4/5=0) (3=.n), gen(Bwarm) (1301 differences between momwarm and Bwarm) . 
label var Bwarm "Working mom can have warm relations?" . label val Bwarm Lprowork . note Bwarm: 3=neutral in source was coded .n \ `tag' . tab Bwarm momwarm, miss Working | mom can | have warm | Working mom can have warm relations w kids? relations? | 1StAgree 2Agree 3Neither 4Disagree 5StDisagr | Total -----------+-------------------------------------------------------+---------- 0_noNeg | 0 0 0 336 56 | 392 1_yesPos | 464 712 0 0 0 | 1,176 a_Unsure | 0 0 0 0 0 | 26 b_Refused | 0 0 0 0 0 | 7 n_Neutral | 0 0 197 0 0 | 197 -----------+-------------------------------------------------------+---------- Total | 464 712 197 336 56 | 1,798 Working | Working mom can have mom can | warm relations w have warm | kids? relations? | a_Can't c b_Refused | Total -----------+----------------------+---------- 0_noNeg | 0 0 | 392 1_yesPos | 0 0 | 1,176 a_Unsure | 26 0 | 26 b_Refused | 0 7 | 7 n_Neutral | 0 0 | 197 -----------+----------------------+---------- Total | 26 7 | 1,798 . . * kidsuffer: 1=SA preschool child suffers with working mom . * Bkids: 1=agree don't suffer (reverse coding) . recode kidsuffer (1/2=0) (4/5=1) (3=.n), gen(Bkids) (1755 differences between kidsuffer and Bkids) . label var Bkids "Agree kids don't suffer with working mom?" . label val Bkids Lprowork . note Bkids: 3=neutral in source was coded .n \ `tag' . tab kidsuffer Bkids, miss Pre-school | Agree kids don't suffer with working mom? child suffers? | 0_noNeg 1_yesPos a_Unsure b_Refused | Total ---------------+--------------------------------------------+---------- 1StAgree | 343 0 0 0 | 343 2Agree | 795 0 0 0 | 795 3Neither | 0 0 0 0 | 272 4Disagree | 0 308 0 0 | 308 5StDisagree | 0 37 0 0 | 37 a_Can't choose | 0 0 35 0 | 35 b_Refused | 0 0 0 8 | 8 ---------------+--------------------------------------------+---------- Total | 1,138 345 35 8 | 1,798 | Agree kids | don't | suffer | with | working Pre-school | mom? child suffers? 
| n_Neutral | Total ---------------+-----------+---------- 1StAgree | 0 | 343 2Agree | 0 | 795 3Neither | 272 | 272 4Disagree | 0 | 308 5StDisagree | 0 | 37 a_Can't choose | 0 | 35 b_Refused | 0 | 8 ---------------+-----------+---------- Total | 272 | 1,798 . . * famsuffer: 1=SA family suffers with working mom . * Bfamily: 1=agree don't suffer (reverse coding) . recode famsuffer (1/2=0) (4/5=1) (3=.n), gen(Bfamily) (1759 differences between famsuffer and Bfamily) . label var Bfamily "Agree family life doesn't suffer?" . label val Bfamily Lprowork . note Bfamily: 3=neutral in source was coded .n \ `tag' . tab famsuffer Bfamily, miss Family life | Agree family life doesn't suffer? suffers? | 0_noNeg 1_yesPos a_Unsure b_Refused | Total ---------------+--------------------------------------------+---------- 1StAgree | 373 0 0 0 | 373 2Agree | 732 0 0 0 | 732 3Neither | 0 0 0 0 | 278 4Disagree | 0 326 0 0 | 326 5StDisagree | 0 50 0 0 | 50 a_Can't choose | 0 0 30 0 | 30 b_Refused | 0 0 0 9 | 9 ---------------+--------------------------------------------+---------- Total | 1,105 376 30 9 | 1,798 | Agree | family | life | doesn't Family life | suffer? suffers? | n_Neutral | Total ---------------+-----------+---------- 1StAgree | 0 | 373 2Agree | 0 | 732 3Neither | 278 | 278 4Disagree | 0 | 326 5StDisagree | 0 | 50 a_Can't choose | 0 | 30 b_Refused | 0 | 9 ---------------+-----------+---------- Total | 278 | 1,798 . . * wanthome: 1=SA really wants to stay home . * Bnohome: 1=agree don't want home (reverse coding) . recode wanthome (1/2=0) (4/5=1) (3=.n), gen(Bnohome) (1717 differences between wanthome and Bnohome) . label var Bnohome "Agree women don't want home and kids?" . label val Bnohome Lprowork . note Bnohome: 3=neutral in source was coded .n \ `tag' . tab wanthome Bnohome, miss Women really | want is home & | Agree women don't want home and kids? kids? 
| 0_noNeg 1_yesPos a_Unsure b_Refused | Total ---------------+--------------------------------------------+---------- 1StAgree | 268 0 0 0 | 268 2Agree | 615 0 0 0 | 615 3Neither | 0 0 0 0 | 365 4Disagree | 0 406 0 0 | 406 5StDisagree | 0 63 0 0 | 63 a_Can't choose | 0 0 72 0 | 72 b_Refused | 0 0 0 9 | 9 ---------------+--------------------------------------------+---------- Total | 883 469 72 9 | 1,798 | Agree | women | don't want Women really | home and want is home & | kids? kids? | n_Neutral | Total ---------------+-----------+---------- 1StAgree | 0 | 268 2Agree | 0 | 615 3Neither | 365 | 365 4Disagree | 0 | 406 5StDisagree | 0 | 63 a_Can't choose | 0 | 72 b_Refused | 0 | 9 ---------------+-----------+---------- Total | 365 | 1,798 . . * housesat: 1=SA house just as satisfying . * Bjobsat: 1=agree job is satisfying (reverse coding) . recode housesat (1/2=0) (4/5=1) (3=.n), gen(Bjobsat) (1680 differences between housesat and Bjobsat) . label var Bjobsat "Agree paid job satisfies more?" . label val Bjobsat Lprowork . note Bjobsat: 3=neutral in source was coded .n \ `tag' . tab housesat Bjobsat, miss Housework | satisfies like | Agree paid job satisfies more? paid job? | 0_noNeg 1_yesPos a_Unsure b_Refused | Total ---------------+--------------------------------------------+---------- 1StAgree | 190 0 0 0 | 190 2Agree | 503 0 0 0 | 503 3Neither | 0 0 0 0 | 432 4Disagree | 0 470 0 0 | 470 5StDisagree | 0 85 0 0 | 85 a_Can't choose | 0 0 106 0 | 106 b_Refused | 0 0 0 12 | 12 ---------------+--------------------------------------------+---------- Total | 693 555 106 12 | 1,798 | Agree paid | job Housework | satisfies satisfies like | more? paid job? | n_Neutral | Total ---------------+-----------+---------- 1StAgree | 0 | 190 2Agree | 0 | 503 3Neither | 432 | 432 4Disagree | 0 | 470 5StDisagree | 0 | 85 a_Can't choose | 0 | 106 b_Refused | 0 | 12 ---------------+-----------+---------- Total | 432 | 1,798 . . * workbest: 1=SA work is best for independence . 
* Bindep: 1=agree job gives indep (not reversed) . recode workbest (1/2=1) (4/5=0) (3=.n), gen(Bindep) (1205 differences between workbest and Bindep) . label var Bindep "Agree work creates independence?" . label val Bindep Lprowork . note Bindep: 3=neutral in source was coded .n \ `tag' . tab workbest Bindep, miss Work best for | women's | Agree work creates independence? independence? | 0_noNeg 1_yesPos a_Unsure b_Refused | Total ---------------+--------------------------------------------+---------- 1StAgree | 0 505 0 0 | 505 2Agree | 0 767 0 0 | 767 3Neither | 0 0 0 0 | 272 4Disagree | 143 0 0 0 | 143 5StDisagree | 23 0 0 0 | 23 a_Can't choose | 0 0 76 0 | 76 b_Refused | 0 0 0 12 | 12 ---------------+--------------------------------------------+---------- Total | 166 1,272 76 12 | 1,798 | Agree work | creates Work best for | independen women's | ce? independence? | n_Neutral | Total ---------------+-----------+---------- 1StAgree | 0 | 505 2Agree | 0 | 767 3Neither | 272 | 272 4Disagree | 0 | 143 5StDisagree | 0 | 23 a_Can't choose | 0 | 76 b_Refused | 0 | 12 ---------------+-----------+---------- Total | 272 | 1,798 . . // #4 . // check that all are coded in same direction . . codebook B*, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- Bwarm 1568 2 .75 0 1 Working mom can have warm relations? Bkids 1483 2 .2326365 0 1 Agree kids don't suffer with work... Bfamily 1481 2 .2538825 0 1 Agree family life doesn't suffer? Bnohome 1352 2 .3468935 0 1 Agree women don't want home and k... Bjobsat 1248 2 .4447115 0 1 Agree paid job satisfies more? Bindep 1438 2 .8845619 0 1 Agree work creates independence? -------------------------------------------------------------------------------- . 
pwcorr B*, obs | Bwarm Bkids Bfamily Bnohome Bjobsat Bindep -------------+------------------------------------------------------ Bwarm | 1.0000 | 1568 | Bkids | 0.2351 1.0000 | 1311 1483 | Bfamily | 0.2289 0.5775 1.0000 | 1314 1312 1481 | Bnohome | 0.1382 0.2327 0.2850 1.0000 | 1208 1157 1161 1352 | Bjobsat | 0.0396 0.1446 0.1672 0.4659 1.0000 | 1119 1071 1071 1033 1248 | Bindep | 0.0345 0.0185 0.0546 0.1048 0.1538 1.0000 | 1279 1217 1211 1122 1058 1438 | . . // #5 . // cleanup and save . . sort id . quietly compress . label data "Workflow example of adding analysis variables \ `date'" . note: x-wf6-create02-binary.dta \ `tag' . * in stata 10 and later: datasignature set, reset . save x-wf6-create02-binary, replace file x-wf6-create02-binary.dta saved . . clear . use x-wf6-create02-binary (Workflow example of adding analysis variables \ 2008-10-24) . * in stata 10 and later: datasignature confirm . notes _dta: 1. wf-russia01.dta \ wf-isspru01.dta \ wf-russia01-support.do jsl 2008-04-02. 2. wf-russia02.dta \ wf6-create01.do jsl 2008-10-24. 3. x-wf6-create02-binary.dta \ wf6-create02.do jsl 2008-10-24. id: 1. clone of v2 wf-russia01-support.do jsl 2008-04-02. momwarm: 1. clone of v4 wf-russia01-support.do jsl 2008-04-02. 2. low values == pro working women. kidsuffer: 1. clone of v5 wf-russia01-support.do jsl 2008-04-02. 2. high values == pro working women. famsuffer: 1. clone of v6 wf-russia01-support.do jsl 2008-04-02. 2. high values == pro working women. wanthome: 1. clone of v7 wf-russia01-support.do jsl 2008-04-02. 2. high values == pro working women. housesat: 1. clone of v8 wf-russia01-support.do jsl 2008-04-02. 2. high values == pro working women. workbest: 1. clone of v9 wf-russia01-support.do jsl 2008-04-02. 2. low values == pro working women. gender: 1. clone of v200 wf-russia01-support.do jsl 2008-04-02. age: 1. clone of v201 wf-russia01-support.do jsl 2008-04-02. marstat: 1. clone of v202 wf-russia01-support.do jsl 2008-04-02. edyears: 1. 
copy of v204 wf-russia01-support.do jsl 2008-04-02. edlevel: 1. clone of v232 wf-russia01-support.do jsl 2008-04-02. empstat: 1. clone of v239 wf-russia01-support.do jsl 2008-04-02. earnings: 1. clone of v249 wf-russia01-support.do jsl 2008-04-02. female: 1. based on gender \ wf6-create01.do jsl 2008-10-24. male: 1. based on gender \ wf6-create01.do jsl 2008-10-24. married: 1. recoding of marstat \ married includes married, widowed, divorced, separated \ wf6-create01.do jsl 2008-10-24. hidegree: 1. recode of edlevel \ wf6-create01.do jsl 2008-10-24. fulltime: 1. recoding of empstat; includes fulltime & retired \ wf6-create01.do jsl 2008-10-24. Bwarm: 1. 3=neutral in source was coded .n \ wf6-create02.do jsl 2008-10-24. Bkids: 1. 3=neutral in source was coded .n \ wf6-create02.do jsl 2008-10-24. Bfamily: 1. 3=neutral in source was coded .n \ wf6-create02.do jsl 2008-10-24. Bnohome: 1. 3=neutral in source was coded .n \ wf6-create02.do jsl 2008-10-24. Bjobsat: 1. 3=neutral in source was coded .n \ wf6-create02.do jsl 2008-10-24. Bindep: 1. 3=neutral in source was coded .n \ wf6-create02.do jsl 2008-10-24. . codebook, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- id 1798 1798 1800900 1800001 1801798 Respondent number momwarm 1765 5 2.324646 1 5 Working mom can have warm... kidsuffer 1755 5 2.373789 1 5 Pre-school child suffers? famsuffer 1759 5 2.401933 1 5 Family life suffers? wanthome 1717 5 2.639487 1 5 Women really want is home... housesat 1680 5 2.855357 1 5 Housework satisfies like ... workbest 1710 5 2.071345 1 5 Work best for women's ind... 
gender 1798 2 1.613459 1 2 Gender: 1=male, 2=female age 1798 71 46.87875 18 91 Age in years marstat 1779 5 2.105115 1 5 Marital status edyears 1639 21 11.57169 1 21 Years of schooling edlevel 1795 7 4.960446 1 7 Education level empstat 1765 9 3.774504 1 10 Current employment status earnings 1390 190 2424.938 25 90000 Earnings female 1798 2 .6134594 0 1 Female? male 1798 2 .3865406 0 1 Male? married 1779 2 .8431703 0 1 Ever married? hidegree 1795 2 .2250696 0 1 Any higher education? fulltime 1765 2 .7852691 0 1 Ever worked full time? Bwarm 1568 2 .75 0 1 Working mom can have warm... Bkids 1483 2 .2326365 0 1 Agree kids don't suffer w... Bfamily 1481 2 .2538825 0 1 Agree family life doesn't... Bnohome 1352 2 .3468935 0 1 Agree women don't want ho... Bjobsat 1248 2 .4447115 0 1 Agree paid job satisfies ... Bindep 1438 2 .8845619 0 1 Agree work creates indepe... -------------------------------------------------------------------------------- . . // #6 . // check the changes . . cf _all using wf-russia02.dta id: 1798 mismatches momwarm: 1301 mismatches kidsuffer: 1274 mismatches famsuffer: 1320 mismatches wanthome: 1380 mismatches housesat: 1380 mismatches workbest: 1280 mismatches gender: 856 mismatches age: 1766 mismatches marstat: 1191 mismatches edyears: 1651 mismatches edlevel: 1164 mismatches empstat: 1209 mismatches earnings: 1712 mismatches female: 856 mismatches male: 856 mismatches married: 494 mismatches hidegree: 596 mismatches fulltime: 646 mismatches Bwarm: does not exist in using Bkids: does not exist in using Bfamily: does not exist in using Bnohome: does not exist in using Bjobsat: does not exist in using Bindep: does not exist in using r(9); . . log close log: D:\wf\work\wf6-create02-binary.log log type: text closed on: 24 Oct 2008, 09:41:43 -------------------------------------------------------------------------------- . exit end of do-file . do wf6-create03-noneutral.do, nostop // so cf doesn't end do file . capture log close . 
log using wf6-create03-noneutral, replace text (note: file D:\wf\work\wf6-create03-noneutral.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf6-create03-noneutral.log log type: text opened on: 24 Oct 2008, 09:41:43 . . // program: wf6-create03-noneutral.do \ for stata 9 - step 3 of 3 . // task: Create variables for ISSP data . // project: workflow chapter 6 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // define local . . local tag "wf6-create03-noneutral.do jsl 2008-10-24." . . // #2 . // load data . . use x-wf6-create02-binary, replace (Workflow example of adding analysis variables \ 2008-10-24) . * in stata 10 and later: datasignature confirm . . // #3 . // create ordinal outcomes without neutral . // note: this shows how to use local macros for this . . * new labels . label def Lsa_sd 1 1_SA_Pos 2 2_A_Pos 3 3_D_Neg /// > 4 4_SD_Neg .a a_Unsure .b b_Refused .n n_Neutral . . * momwarm: 1=SA working mom can have warm relationship . * C4warm: 1=SA (not reversed) . local vin momwarm . local vout C4warm . recode `vin' (1=1) (2=2) (3=.n) (4=3) (5=4), gen(`vout') (589 differences between momwarm and C4warm) . label var `vout' "Working mom can have warm relations?" . label val `vout' Lsa_sd . note `vout': 3=neutral in source was coded .n \ `tag' . tab `vin' `vout', m Working mom | can have warm | relations w | Working mom can have warm relations? kids? 
| 1_SA_Pos 2_A_Pos 3_D_Neg 4_SD_Neg | Total ---------------+--------------------------------------------+---------- 1StAgree | 464 0 0 0 | 464 2Agree | 0 712 0 0 | 712 3Neither | 0 0 0 0 | 197 4Disagree | 0 0 336 0 | 336 5StDisagree | 0 0 0 56 | 56 a_Can't choose | 0 0 0 0 | 26 b_Refused | 0 0 0 0 | 7 ---------------+--------------------------------------------+---------- Total | 464 712 336 56 | 1,798 Working mom | can have warm | Working mom can have warm relations w | relations? kids? | a_Unsure b_Refused n_Neutral | Total ---------------+---------------------------------+---------- 1StAgree | 0 0 0 | 464 2Agree | 0 0 0 | 712 3Neither | 0 0 197 | 197 4Disagree | 0 0 0 | 336 5StDisagree | 0 0 0 | 56 a_Can't choose | 26 0 0 | 26 b_Refused | 0 7 0 | 7 ---------------+---------------------------------+---------- Total | 26 7 197 | 1,798 . . * kidsuffer: 1=SA preschool child suffers with working mom . * C4kids: 1=SA don't suffer (reverse coding) . local vin kidsuffer . local vout C4kids // reverse coding . recode `vin' (1=4) (2=3) (3=.n) (4=2) (5=1), gen(`vout') (1755 differences between kidsuffer and C4kids) . label var `vout' "Kids don't suffer with working mom?" . label val `vout' Lsa_sd . note `vout': 3=neutral in source was coded .n \ `tag' . tab `vin' `vout', m Pre-school | Kids don't suffer with working mom? child suffers? | 1_SA_Pos 2_A_Pos 3_D_Neg 4_SD_Neg | Total ---------------+--------------------------------------------+---------- 1StAgree | 0 0 0 343 | 343 2Agree | 0 0 795 0 | 795 3Neither | 0 0 0 0 | 272 4Disagree | 0 308 0 0 | 308 5StDisagree | 37 0 0 0 | 37 a_Can't choose | 0 0 0 0 | 35 b_Refused | 0 0 0 0 | 8 ---------------+--------------------------------------------+---------- Total | 37 308 795 343 | 1,798 | Kids don't suffer with working Pre-school | mom? child suffers? 
| a_Unsure b_Refused n_Neutral | Total ---------------+---------------------------------+---------- 1StAgree | 0 0 0 | 343 2Agree | 0 0 0 | 795 3Neither | 0 0 272 | 272 4Disagree | 0 0 0 | 308 5StDisagree | 0 0 0 | 37 a_Can't choose | 35 0 0 | 35 b_Refused | 0 8 0 | 8 ---------------+---------------------------------+---------- Total | 35 8 272 | 1,798 . . * famsuffer: 1=SA family suffers with working mom . * C4family: 1=SA don't suffer (reverse coding) . local vin famsuffer . local vout C4family . recode `vin' (1=4) (2=3) (3=.n) (4=2) (5=1), gen(`vout') (1759 differences between famsuffer and C4family) . label var `vout' "Family life doesn't suffer?" . label val `vout' Lsa_sd . note `vout': 3=neutral in source was coded .n \ `tag' . tab `vin' `vout', m Family life | Family life doesn't suffer? suffers? | 1_SA_Pos 2_A_Pos 3_D_Neg 4_SD_Neg | Total ---------------+--------------------------------------------+---------- 1StAgree | 0 0 0 373 | 373 2Agree | 0 0 732 0 | 732 3Neither | 0 0 0 0 | 278 4Disagree | 0 326 0 0 | 326 5StDisagree | 50 0 0 0 | 50 a_Can't choose | 0 0 0 0 | 30 b_Refused | 0 0 0 0 | 9 ---------------+--------------------------------------------+---------- Total | 50 326 732 373 | 1,798 Family life | Family life doesn't suffer? suffers? | a_Unsure b_Refused n_Neutral | Total ---------------+---------------------------------+---------- 1StAgree | 0 0 0 | 373 2Agree | 0 0 0 | 732 3Neither | 0 0 278 | 278 4Disagree | 0 0 0 | 326 5StDisagree | 0 0 0 | 50 a_Can't choose | 30 0 0 | 30 b_Refused | 0 9 0 | 9 ---------------+---------------------------------+---------- Total | 30 9 278 | 1,798 . . * wanthome: 1=SA really wants to stay home . * C4nohome: 1=SA don't want home (reverse coding) . local vin wanthome . local vout C4nohome . recode `vin' (1=4) (2=3) (3=.n) (4=2) (5=1), gen(`vout') (1717 differences between wanthome and C4nohome) . label var `vout' "Agree women don't want home and kids?" . label val `vout' Lsa_sd . 
note `vout': 3=neutral in source was coded .n \ `tag' . tab `vin' `vout', m Women really | want is home & | Agree women don't want home and kids? kids? | 1_SA_Pos 2_A_Pos 3_D_Neg 4_SD_Neg | Total ---------------+--------------------------------------------+---------- 1StAgree | 0 0 0 268 | 268 2Agree | 0 0 615 0 | 615 3Neither | 0 0 0 0 | 365 4Disagree | 0 406 0 0 | 406 5StDisagree | 63 0 0 0 | 63 a_Can't choose | 0 0 0 0 | 72 b_Refused | 0 0 0 0 | 9 ---------------+--------------------------------------------+---------- Total | 63 406 615 268 | 1,798 Women really | Agree women don't want home and want is home & | kids? kids? | a_Unsure b_Refused n_Neutral | Total ---------------+---------------------------------+---------- 1StAgree | 0 0 0 | 268 2Agree | 0 0 0 | 615 3Neither | 0 0 365 | 365 4Disagree | 0 0 0 | 406 5StDisagree | 0 0 0 | 63 a_Can't choose | 72 0 0 | 72 b_Refused | 0 9 0 | 9 ---------------+---------------------------------+---------- Total | 72 9 365 | 1,798 . . * housesat: 1=SA house just as satisfying . * C4jobsat: 1=SA job is satisfying (reverse coding) . local vin housesat . local vout C4jobsat . recode `vin' (1=4) (2=3) (3=.n) (4=2) (5=1), gen(`vout') (1680 differences between housesat and C4jobsat) . label var `vout' "Agree paid job satisfies more?" . label val `vout' Lsa_sd . note `vout': 3=neutral in source was coded .n \ `tag' . tab `vin' `vout', m Housework | satisfies like | Agree paid job satisfies more? paid job? | 1_SA_Pos 2_A_Pos 3_D_Neg 4_SD_Neg | Total ---------------+--------------------------------------------+---------- 1StAgree | 0 0 0 190 | 190 2Agree | 0 0 503 0 | 503 3Neither | 0 0 0 0 | 432 4Disagree | 0 470 0 0 | 470 5StDisagree | 85 0 0 0 | 85 a_Can't choose | 0 0 0 0 | 106 b_Refused | 0 0 0 0 | 12 ---------------+--------------------------------------------+---------- Total | 85 470 503 190 | 1,798 Housework | satisfies like | Agree paid job satisfies more? paid job? 
| a_Unsure b_Refused n_Neutral | Total ---------------+---------------------------------+---------- 1StAgree | 0 0 0 | 190 2Agree | 0 0 0 | 503 3Neither | 0 0 432 | 432 4Disagree | 0 0 0 | 470 5StDisagree | 0 0 0 | 85 a_Can't choose | 106 0 0 | 106 b_Refused | 0 12 0 | 12 ---------------+---------------------------------+---------- Total | 106 12 432 | 1,798 . . * workbest: 1=SA work is best for independence . * C4indep: 1=SA job gives indep (not reversed) . local vin workbest . local vout C4indep . recode `vin' (1=1) (2=2) (3=.n) (4=3) (5=4), gen(`vout') (438 differences between workbest and C4indep) . label var `vout' "Agree work creates independence?" . label val `vout' Lsa_sd . note `vout': 3=neutral in source was coded .n \ `tag' . tab `vin' `vout', m Work best for | women's | Agree work creates independence? independence? | 1_SA_Pos 2_A_Pos 3_D_Neg 4_SD_Neg | Total ---------------+--------------------------------------------+---------- 1StAgree | 505 0 0 0 | 505 2Agree | 0 767 0 0 | 767 3Neither | 0 0 0 0 | 272 4Disagree | 0 0 143 0 | 143 5StDisagree | 0 0 0 23 | 23 a_Can't choose | 0 0 0 0 | 76 b_Refused | 0 0 0 0 | 12 ---------------+--------------------------------------------+---------- Total | 505 767 143 23 | 1,798 Work best for | women's | Agree work creates independence? independence? | a_Unsure b_Refused n_Neutral | Total ---------------+---------------------------------+---------- 1StAgree | 0 0 0 | 505 2Agree | 0 0 0 | 767 3Neither | 0 0 272 | 272 4Disagree | 0 0 0 | 143 5StDisagree | 0 0 0 | 23 a_Can't choose | 76 0 0 | 76 b_Refused | 0 12 0 | 12 ---------------+---------------------------------+---------- Total | 76 12 272 | 1,798 . . // #4 . // check new variables . . * descriptives . codebook C4*, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- C4warm 1568 4 1.989796 1 4 Working mom can have warm relations? 
C4kids 1483 4 2.973702 1 4 Kids don't suffer with working mom? C4family 1481 4 2.964213 1 4 Family life doesn't suffer? C4nohome 1352 4 2.804734 1 4 Agree women don't want home and k... C4jobsat 1248 4 2.639423 1 4 Agree paid job satisfies more? C4indep 1438 4 1.78025 1 4 Agree work creates independence? -------------------------------------------------------------------------------- . * correlations . pwcorr C4*, obs | C4warm C4kids C4family C4nohome C4jobsat C4indep -------------+------------------------------------------------------ C4warm | 1.0000 | 1568 | C4kids | 0.1942 1.0000 | 1311 1483 | C4family | 0.2146 0.6464 1.0000 | 1314 1312 1481 | C4nohome | 0.0880 0.3085 0.3643 1.0000 | 1208 1157 1161 1352 | C4jobsat | -0.0026 0.2066 0.2708 0.4942 1.0000 | 1119 1071 1071 1033 1248 | C4indep | 0.0803 -0.0210 0.0159 0.0828 0.1752 1.0000 | 1279 1217 1211 1122 1058 1438 | . * binary compared to 4 category scales . foreach s in warm kids family nohome jobsat indep { 2. pwcorr B`s' C4`s', obs 3. } | Bwarm C4warm -------------+------------------ Bwarm | 1.0000 | 1568 | C4warm | -0.8239 1.0000 | 1568 1568 | | Bkids C4kids -------------+------------------ Bkids | 1.0000 | 1483 | C4kids | -0.8114 1.0000 | 1483 1483 | | Bfamily C4family -------------+------------------ Bfamily | 1.0000 | 1481 | C4family | -0.8223 1.0000 | 1481 1481 | | Bnohome C4nohome -------------+------------------ Bnohome | 1.0000 | 1352 | C4nohome | -0.8510 1.0000 | 1352 1352 | | Bjobsat C4jobsat -------------+------------------ Bjobsat | 1.0000 | 1248 | C4jobsat | -0.8657 1.0000 | 1248 1248 | | Bindep C4indep -------------+------------------ Bindep | 1.0000 | 1438 | C4indep | -0.7186 1.0000 | 1438 1438 | . * 4 category and 5 category correlations . pwcorr momwarm C4warm, obs // not reversed | momwarm C4warm -------------+------------------ momwarm | 1.0000 | 1765 | C4warm | 0.9785 1.0000 | 1568 1568 | . 
pwcorr kidsuffer C4kids, obs // reversed | kidsuf~r C4kids -------------+------------------ kidsuffer | 1.0000 | 1755 | C4kids | -0.9747 1.0000 | 1483 1483 | . pwcorr famsuffer C4family, obs // reversed | famsuf~r C4family -------------+------------------ famsuffer | 1.0000 | 1759 | C4family | -0.9771 1.0000 | 1481 1481 | . pwcorr wanthome C4nohome, obs // reversed | wanthome C4nohome -------------+------------------ wanthome | 1.0000 | 1717 | C4nohome | -0.9793 1.0000 | 1352 1352 | . pwcorr housesat C4jobsat, obs // reversed | housesat C4jobsat -------------+------------------ housesat | 1.0000 | 1680 | C4jobsat | -0.9808 1.0000 | 1248 1248 | . pwcorr workbest C4indep, obs // not reversed | workbest C4indep -------------+------------------ workbest | 1.0000 | 1710 | C4indep | 0.9716 1.0000 | 1438 1438 | . . // #5 . // cleanup and save . . sort id . qui compress . label data "Workflow example using ISSP 2002 Russia \ 2008-10-24" . note: wf-russia03.dta `tag' . * in stata 10 and later: datasignature set, reset . save wf-russia03, replace file wf-russia03.dta saved . . // #6 . // check the changes . . use wf-russia03, clear (Workflow example using ISSP 2002 Russia \ 2008-10-24) . * in stata 10 and later: datasignature confirm . notes _dta: 1. wf-russia01.dta \ wf-isspru01.dta \ wf-russia01-support.do jsl 2008-04-02. 2. wf-russia02.dta \ wf6-create01.do jsl 2008-10-24. 3. x-wf6-create02-binary.dta \ wf6-create02.do jsl 2008-10-24. 4. wf-russia03.dta wf6-create03-noneutral.do jsl 2008-10-24. id: 1. clone of v2 wf-russia01-support.do jsl 2008-04-02. momwarm: 1. clone of v4 wf-russia01-support.do jsl 2008-04-02. 2. low values == pro working women. kidsuffer: 1. clone of v5 wf-russia01-support.do jsl 2008-04-02. 2. high values == pro working women. famsuffer: 1. clone of v6 wf-russia01-support.do jsl 2008-04-02. 2. high values == pro working women. wanthome: 1. clone of v7 wf-russia01-support.do jsl 2008-04-02. 2. high values == pro working women. housesat: 1. 
clone of v8 wf-russia01-support.do jsl 2008-04-02. 2. high values == pro working women. workbest: 1. clone of v9 wf-russia01-support.do jsl 2008-04-02. 2. low values == pro working women. gender: 1. clone of v200 wf-russia01-support.do jsl 2008-04-02. age: 1. clone of v201 wf-russia01-support.do jsl 2008-04-02. marstat: 1. clone of v202 wf-russia01-support.do jsl 2008-04-02. edyears: 1. copy of v204 wf-russia01-support.do jsl 2008-04-02. edlevel: 1. clone of v232 wf-russia01-support.do jsl 2008-04-02. empstat: 1. clone of v239 wf-russia01-support.do jsl 2008-04-02. earnings: 1. clone of v249 wf-russia01-support.do jsl 2008-04-02. female: 1. based on gender \ wf6-create01.do jsl 2008-10-24. male: 1. based on gender \ wf6-create01.do jsl 2008-10-24. married: 1. recoding of marstat \ married includes married, widowed, divorced, separated \ wf6-create01.do jsl 2008-10-24. hidegree: 1. recode of edlevel \ wf6-create01.do jsl 2008-10-24. fulltime: 1. recoding of empstat; includes fulltime & retired \ wf6-create01.do jsl 2008-10-24. Bwarm: 1. 3=neutral in source was coded .n \ wf6-create02.do jsl 2008-10-24. Bkids: 1. 3=neutral in source was coded .n \ wf6-create02.do jsl 2008-10-24. Bfamily: 1. 3=neutral in source was coded .n \ wf6-create02.do jsl 2008-10-24. Bnohome: 1. 3=neutral in source was coded .n \ wf6-create02.do jsl 2008-10-24. Bjobsat: 1. 3=neutral in source was coded .n \ wf6-create02.do jsl 2008-10-24. Bindep: 1. 3=neutral in source was coded .n \ wf6-create02.do jsl 2008-10-24. C4warm: 1. 3=neutral in source was coded .n \ wf6-create03-noneutral.do jsl 2008-10-24. C4kids: 1. 3=neutral in source was coded .n \ wf6-create03-noneutral.do jsl 2008-10-24. C4family: 1. 3=neutral in source was coded .n \ wf6-create03-noneutral.do jsl 2008-10-24. C4nohome: 1. 3=neutral in source was coded .n \ wf6-create03-noneutral.do jsl 2008-10-24. C4jobsat: 1. 3=neutral in source was coded .n \ wf6-create03-noneutral.do jsl 2008-10-24. C4indep: 1. 
3=neutral in source was coded .n \ wf6-create03-noneutral.do jsl 2008-10-24. . codebook, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- id 1798 1798 1800900 1800001 1801798 Respondent number momwarm 1765 5 2.324646 1 5 Working mom can have warm... kidsuffer 1755 5 2.373789 1 5 Pre-school child suffers? famsuffer 1759 5 2.401933 1 5 Family life suffers? wanthome 1717 5 2.639487 1 5 Women really want is home... housesat 1680 5 2.855357 1 5 Housework satisfies like ... workbest 1710 5 2.071345 1 5 Work best for women's ind... gender 1798 2 1.613459 1 2 Gender: 1=male, 2=female age 1798 71 46.87875 18 91 Age in years marstat 1779 5 2.105115 1 5 Marital status edyears 1639 21 11.57169 1 21 Years of schooling edlevel 1795 7 4.960446 1 7 Education level empstat 1765 9 3.774504 1 10 Current employment status earnings 1390 190 2424.938 25 90000 Earnings female 1798 2 .6134594 0 1 Female? male 1798 2 .3865406 0 1 Male? married 1779 2 .8431703 0 1 Ever married? hidegree 1795 2 .2250696 0 1 Any higher education? fulltime 1765 2 .7852691 0 1 Ever worked full time? Bwarm 1568 2 .75 0 1 Working mom can have warm... Bkids 1483 2 .2326365 0 1 Agree kids don't suffer w... Bfamily 1481 2 .2538825 0 1 Agree family life doesn't... Bnohome 1352 2 .3468935 0 1 Agree women don't want ho... Bjobsat 1248 2 .4447115 0 1 Agree paid job satisfies ... Bindep 1438 2 .8845619 0 1 Agree work creates indepe... C4warm 1568 4 1.989796 1 4 Working mom can have warm... C4kids 1483 4 2.973702 1 4 Kids don't suffer with wo... C4family 1481 4 2.964213 1 4 Family life doesn't suffer? C4nohome 1352 4 2.804734 1 4 Agree women don't want ho... C4jobsat 1248 4 2.639423 1 4 Agree paid job satisfies ... C4indep 1438 4 1.78025 1 4 Agree work creates indepe... -------------------------------------------------------------------------------- . 
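* sketch, not part of the logged run: cf exits with r(9) when any variable
. * differs, which is why this do-file is called with the nostop option.
. * assuming only the return code is needed afterwards, the error could
. * instead be trapped inside the do-file with:
. *     capture noisily cf _all using wf-russia02.dta
. *     display "cf return code: " _rc
.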
cf _all using wf-russia02.dta
             id:  1798 mismatches
        momwarm:  1338 mismatches
      kidsuffer:  1298 mismatches
      famsuffer:  1299 mismatches
       wanthome:  1374 mismatches
       housesat:  1405 mismatches
       workbest:  1322 mismatches
         gender:  874 mismatches
            age:  1766 mismatches
        marstat:  1195 mismatches
        edyears:  1655 mismatches
        edlevel:  1183 mismatches
        empstat:  1237 mismatches
       earnings:  1719 mismatches
         female:  874 mismatches
           male:  874 mismatches
        married:  505 mismatches
       hidegree:  614 mismatches
       fulltime:  654 mismatches
          Bwarm:  does not exist in using
          Bkids:  does not exist in using
        Bfamily:  does not exist in using
        Bnohome:  does not exist in using
        Bjobsat:  does not exist in using
         Bindep:  does not exist in using
         C4warm:  does not exist in using
         C4kids:  does not exist in using
       C4family:  does not exist in using
       C4nohome:  does not exist in using
       C4jobsat:  does not exist in using
        C4indep:  does not exist in using
r(9);
.
. log close
       log:  D:\wf\work\wf6-create03-noneutral.log
  log type:  text
 closed on:  24 Oct 2008, 09:41:44
--------------------------------------------------------------------------------
. exit

end of do-file

.
. * merging
. do wf6-merge-match.do
. capture log close
. log using wf6-merge-match, replace text
(note: file D:\wf\work\wf6-merge-match.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf6-merge-match.log
  log type:  text
 opened on:  24 Oct 2008, 09:41:44

.
. // program: wf6-merge-match.do \ for stata 9
. // task: Example of match merging
. // project: workflow chapter 6
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // define local
.
. local date "2008-10-24"
. local tag "wf6-merge-match.do jsl `date'."
.
. // #2
. // check signatures and load the master dataset
.
. use wf-nls-flim05, clear
(Workflow example with NLS FLIM variables \ 2008-04-02)
. * in stata 10 and later: datasignature confirm
.
.
use wf-nls-cntrl07, clear (Workflow example with NLS control variables \ 2008-04-02) . * in stata 10 and later: datasignature confirm . . // #3 . // merge in the flim dataset and check variables . . merge id using wf-nls-flim05 . tab1 _merge -> tabulation of _merge _merge | Freq. Percent Cum. ------------+----------------------------------- 1 | 21 21.00 21.00 3 | 79 79.00 100.00 ------------+----------------------------------- Total | 100 100.00 . drop _merge . codebook, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- id 100 100 50.5 1 100 ID_CODE, 68 mwA 100 1 0 0 0 Is MW cohort? ywA 100 1 1 1 1 Is YW cohort? birthyrA 100 13 48.02 41 53 Year of birth dropcaseA 100 1 0 0 0 Drop case from further analyses? dropwhyA 100 1 0 0 0 Reason case is to be dropped cohortA 94 2 4.553191 4 5 epsl cohort ageatp1A 100 13 22.98 18 30 Age at panel 1 ageatp2A 100 13 29.98 25 37 Age at panel 2 ageatp3A 100 13 34.98 30 42 Age at panel 3 ageatp4A 100 13 39.98 35 47 Age at panel 4 ageatp5A 100 13 42.98 38 50 Age at panel 5 race3catA 100 2 1.15 1 2 Race with three categories blackA 100 2 .15 0 1 Race: black othrraceA 100 1 0 0 0 Race: other whiteA 100 2 .85 0 1 Race: white educA 100 12 12.76 7 18 Years of education dlwrkp1 69 2 .0434783 0 1 Health limit work, p1 dlwrkp2 49 3 .0204082 -4 1 Health limit work, p2 dlwrkp3 42 2 .0952381 0 1 Health limit work, p3 dlwrkp4 47 3 .0212766 -4 1 Health limit work, p4 dlwrkp5 47 3 -.1702128 -4 1 Health limit work, p5 doutdp1 70 1 0 0 0 Hlth prevents go outdoors, p1 doutdp2 49 1 0 0 0 Hlth prevents go outdoors, p2 doutdp3 44 1 0 0 0 Hlth prevents go outdoors, p3 doutdp4 0 0 . . . Hlth prevents go outdoors, p4 doutdp5 0 0 . . . Hlth prevents go outdoors, p5 flhndp1 70 4 .2 0 4 FL hnd-p1 flhndp2 49 5 .8163265 0 5 FL hnd-p2 flhndp3 44 4 .5681818 0 4 FL hnd-p3 flhndp4 47 4 .893617 0 5 FL hnd-p4 flhndp5 47 3 .9361702 0 4 FL hnd-p5 flhvyp1 0 0 . . . 
FL hvy-p1 flhvyp2 49 5 1 0 5 FL hvy-p2 flhvyp3 44 5 .9545455 0 5 FL hvy-p3 flhvyp4 47 4 1.212766 0 5 FL hvy-p4 flhvyp5 47 4 1.297872 0 5 FL hvy-p5 fllftp1 70 5 .2428571 0 5 FL lft-p1 fllftp2 49 5 .7346939 0 5 FL lft-p2 fllftp3 44 3 .5227273 0 2 FL lft-p3 fllftp4 47 4 .9361702 0 5 FL lft-p4 fllftp5 47 4 1.021277 0 5 FL lft-p5 flrchp1 70 3 .1714286 0 2 FL rch-p1 flrchp2 49 4 .7755102 0 5 FL rch-p2 flrchp3 44 3 .5227273 0 2 FL rch-p3 flrchp4 47 4 .9361702 0 5 FL rch-p4 flrchp5 47 3 .893617 0 4 FL rch-p5 flsitp1 70 5 .2428571 0 5 FL sit-p1 flsitp2 49 5 .8367347 0 5 FL sit-p2 flsitp3 44 4 .6136364 0 4 FL sit-p3 flsitp4 47 4 1.085106 0 5 FL sit-p4 flsitp5 47 4 1.042553 0 5 FL sit-p5 flstdp1 70 4 .2 0 4 FL std-p1 flstdp2 49 5 .9591837 0 5 FL std-p2 flstdp3 44 4 .6136364 0 4 FL std-p3 flstdp4 47 4 1.170213 0 5 FL std-p4 flstdp5 47 4 1.212766 0 5 FL std-p5 flstpp1 70 4 .2 0 4 FL stp-p1 flstpp2 49 5 .877551 0 5 FL stp-p2 flstpp3 44 4 .6136364 0 4 FL stp-p3 flstpp4 47 4 1.042553 0 5 FL stp-p4 flstpp5 47 4 1.297872 0 5 FL stp-p5 flstrp1 70 4 .2 0 4 FL str-p1 flstrp2 49 5 .755102 0 5 FL str-p2 flstrp3 44 3 .5227273 0 2 FL str-p3 flstrp4 47 4 .893617 0 5 FL str-p4 flstrp5 47 3 1.021277 0 4 FL str-p5 flwlkp1 70 3 .1714286 0 2 FL wlk-p1 flwlkp2 49 5 .7346939 0 5 FL wlk-p2 flwlkp3 44 3 .5227273 0 2 FL wlk-p3 flwlkp4 47 4 .893617 0 5 FL wlk-p4 flwlkp5 47 3 .893617 0 4 FL wlk-p5 -------------------------------------------------------------------------------- . . // #4 . // check variables and save merged file . . quietly compress . label data "Workflow merged NLS flim & control variables \ `date'" . note: wf-nls-combined01.dta \ workflow data for chapter 6 \ `tag' . * in stata 10 and later: datasignature set, reset . save wf-nls-combined01, replace file wf-nls-combined01.dta saved . . use wf-nls-combined01, clear (Workflow merged NLS flim & control variables \ 2008-10-24) . * in stata 10 and later: datasignature confirm . notes _dta: 1. 
wf-nls-cntrl07.dta \ revised cntrl07.dta \ wf-merge-support.do jsl 2008-04-02. 2. wf-nls-flim05.dta \ revised flim05.dta \ wf-merge-support.do jsl 2008-04-02. 3. wf-nls-combined01.dta \ workflow data for chapter 6 \ wf6-merge-match.do jsl 2008-10-24. . codebook, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- id 100 100 50.5 1 100 ID_CODE, 68 mwA 100 1 0 0 0 Is MW cohort? ywA 100 1 1 1 1 Is YW cohort? birthyrA 100 13 48.02 41 53 Year of birth dropcaseA 100 1 0 0 0 Drop case from further analyses? dropwhyA 100 1 0 0 0 Reason case is to be dropped cohortA 94 2 4.553191 4 5 epsl cohort ageatp1A 100 13 22.98 18 30 Age at panel 1 ageatp2A 100 13 29.98 25 37 Age at panel 2 ageatp3A 100 13 34.98 30 42 Age at panel 3 ageatp4A 100 13 39.98 35 47 Age at panel 4 ageatp5A 100 13 42.98 38 50 Age at panel 5 race3catA 100 2 1.15 1 2 Race with three categories blackA 100 2 .15 0 1 Race: black othrraceA 100 1 0 0 0 Race: other whiteA 100 2 .85 0 1 Race: white educA 100 12 12.76 7 18 Years of education dlwrkp1 69 2 .0434783 0 1 Health limit work, p1 dlwrkp2 49 3 .0204082 -4 1 Health limit work, p2 dlwrkp3 42 2 .0952381 0 1 Health limit work, p3 dlwrkp4 47 3 .0212766 -4 1 Health limit work, p4 dlwrkp5 47 3 -.1702128 -4 1 Health limit work, p5 doutdp1 70 1 0 0 0 Hlth prevents go outdoors, p1 doutdp2 49 1 0 0 0 Hlth prevents go outdoors, p2 doutdp3 44 1 0 0 0 Hlth prevents go outdoors, p3 doutdp4 0 0 . . . Hlth prevents go outdoors, p4 doutdp5 0 0 . . . Hlth prevents go outdoors, p5 flhndp1 70 4 .2 0 4 FL hnd-p1 flhndp2 49 5 .8163265 0 5 FL hnd-p2 flhndp3 44 4 .5681818 0 4 FL hnd-p3 flhndp4 47 4 .893617 0 5 FL hnd-p4 flhndp5 47 3 .9361702 0 4 FL hnd-p5 flhvyp1 0 0 . . . 
FL hvy-p1 flhvyp2 49 5 1 0 5 FL hvy-p2 flhvyp3 44 5 .9545455 0 5 FL hvy-p3 flhvyp4 47 4 1.212766 0 5 FL hvy-p4 flhvyp5 47 4 1.297872 0 5 FL hvy-p5 fllftp1 70 5 .2428571 0 5 FL lft-p1 fllftp2 49 5 .7346939 0 5 FL lft-p2 fllftp3 44 3 .5227273 0 2 FL lft-p3 fllftp4 47 4 .9361702 0 5 FL lft-p4 fllftp5 47 4 1.021277 0 5 FL lft-p5 flrchp1 70 3 .1714286 0 2 FL rch-p1 flrchp2 49 4 .7755102 0 5 FL rch-p2 flrchp3 44 3 .5227273 0 2 FL rch-p3 flrchp4 47 4 .9361702 0 5 FL rch-p4 flrchp5 47 3 .893617 0 4 FL rch-p5 flsitp1 70 5 .2428571 0 5 FL sit-p1 flsitp2 49 5 .8367347 0 5 FL sit-p2 flsitp3 44 4 .6136364 0 4 FL sit-p3 flsitp4 47 4 1.085106 0 5 FL sit-p4 flsitp5 47 4 1.042553 0 5 FL sit-p5 flstdp1 70 4 .2 0 4 FL std-p1 flstdp2 49 5 .9591837 0 5 FL std-p2 flstdp3 44 4 .6136364 0 4 FL std-p3 flstdp4 47 4 1.170213 0 5 FL std-p4 flstdp5 47 4 1.212766 0 5 FL std-p5 flstpp1 70 4 .2 0 4 FL stp-p1 flstpp2 49 5 .877551 0 5 FL stp-p2 flstpp3 44 4 .6136364 0 4 FL stp-p3 flstpp4 47 4 1.042553 0 5 FL stp-p4 flstpp5 47 4 1.297872 0 5 FL stp-p5 flstrp1 70 4 .2 0 4 FL str-p1 flstrp2 49 5 .755102 0 5 FL str-p2 flstrp3 44 3 .5227273 0 2 FL str-p3 flstrp4 47 4 .893617 0 5 FL str-p4 flstrp5 47 3 1.021277 0 4 FL str-p5 flwlkp1 70 3 .1714286 0 2 FL wlk-p1 flwlkp2 49 5 .7346939 0 5 FL wlk-p2 flwlkp3 44 3 .5227273 0 2 FL wlk-p3 flwlkp4 47 4 .893617 0 5 FL wlk-p4 flwlkp5 47 3 .893617 0 4 FL wlk-p5 -------------------------------------------------------------------------------- . * in stata 10 and later: datasignature confirm . . // #5 . // sorting before merging . . use wf-nls-cntrl07, clear (Workflow example with NLS control variables \ 2008-04-02) . * in stata 10 and later: datasignature confirm . merge id using wf-nls-flim05, sort . . log close log: D:\wf\work\wf6-merge-match.log log type: text closed on: 24 Oct 2008, 09:41:45 -------------------------------------------------------------------------------- . exit end of do-file . do wf6-merge-onetoone.do . capture log close . 
log using wf6-merge-onetoone, replace text (note: file D:\wf\work\wf6-merge-onetoone.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf6-merge-onetoone.log log type: text opened on: 24 Oct 2008, 09:41:45 . . // program: wf6-merge-onetoone.do \ for stata 9 . // task: Example of one to one matching for . // unrelated datasets . // project: workflow chapter 6 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // define local . . local date "2008-10-24" . local tag "wf6-merge-onetoone.do jsl `date'." . . // #2 . // check the datasets . . use wf-lfp, clear (Workflow data on labor force participation \ 2008-04-02) . * in stata 10 and later: datasignature confirm . summarize Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- lfp | 753 .5683931 .4956295 0 1 k5 | 753 .2377158 .523959 0 3 k618 | 753 1.353254 1.319874 0 8 age | 753 42.53785 8.072574 30 60 wc | 753 .2815405 .4500494 0 1 -------------+-------------------------------------------------------- hc | 753 .3917663 .4884694 0 1 lwg | 753 1.097115 .5875564 -2.054124 3.218876 inc | 753 20.12897 11.6348 -.0290001 96 . . use wf-acpub, clear (Workflow data on scientific productivity \ 2008-04-04) . * in stata 10 and later: datasignature confirm . summarize Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- id | 308 58654.49 2283.465 57001 62420 enrol | 278 5.92446 2.92346 3 25 female | 308 .3474026 .4769198 0 1 phd | 308 3.177987 1.012738 1 4.77 pub | 308 3.185065 3.908752 0 31 -------------+-------------------------------------------------------- enrol_fixed | 278 5.564748 1.467253 3 14 . . // #3 . // load the master dataset and merge with the using dataset . . 
use wf-lfp, clear (Workflow data on labor force participation \ 2008-04-02) . merge using wf-acpub . tabulate _merge _merge | Freq. Percent Cum. ------------+----------------------------------- 1 | 445 59.10 59.10 3 | 308 40.90 100.00 ------------+----------------------------------- Total | 753 100.00 . drop _merge . . // #4 . // clean up and save . . quietly compress . label data "Workflow example of combining unrelated datasets \ `date'" . note: wf-merge01.dta \ workflow examples from chapter 6 \ `tag' . . * in stata 10 and later: datasignature set, reset . save wf-merge01, replace file wf-merge01.dta saved . . use wf-merge01, clear (Workflow example of combining unrelated datasets \ 2008-10-24) . * in stata 10 and later: datasignature confirm . notes _dta: 1. wf-lfp.dta \ revised binlfp2.dta \ wf-binlfp-support.do jsl 2008-04-02. 2. Data are from 1976 PSID courtesy of T Mroz. 3. wf-acpub.dta \ revised science2.dta \ wf-acpub-support.do jsl 2008-04-04. 4. wf-merge01.dta \ workflow examples from chapter 6 \ wf6-merge-onetoone.do jsl 2008-10-24. . codebook, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- lfp 753 2 .5683931 0 1 In paid labor force? ... k5 753 4 .2377158 0 3 # kids < 6 k618 753 9 1.353254 0 8 # kids 6-18 age 753 31 42.53785 30 60 Wife's age in years wc 753 2 .2815405 0 1 Wife attended college... hc 753 2 .3917663 0 1 Husband attended coll... lwg 753 676 1.097115 -2.054124 3.218876 Log of wife's estimat... inc 753 621 20.12897 -.0290001 96 Family income excludi... id 308 308 58654.49 57001 62420 ID Number enrol 278 11 5.92446 3 25 Elapsed time from BS ... female 308 2 .3474026 0 1 Is female? phd 308 82 3.177987 1 4.77 Prestige of Ph.D. dep... pub 308 19 3.185065 0 31 Publications in years... enrol_fixed 278 9 5.564748 3 14 Enrolled time from BS... -------------------------------------------------------------------------------- . 
* in stata 10 and later: datasignature confirm . . log close log: D:\wf\work\wf6-merge-onetoone.log log type: text closed on: 24 Oct 2008, 09:41:45 -------------------------------------------------------------------------------- . exit end of do-file . do wf6-merge-nomatch.do . capture log close . log using wf6-merge-nomatch, replace text (note: file D:\wf\work\wf6-merge-nomatch.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf6-merge-nomatch.log log type: text opened on: 24 Oct 2008, 09:41:45 . . // program: wf6-merge-nomatch.do \ for stata 9 . // task: Example of botched match merging . // project: workflow chapter 6 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // check datasets . . use wf-mergebio, clear (Workflow biographical data to illustrate merging \ 2008-04-05) . * in stata 10 and later: datasignature confirm . use wf-mergebib, clear (Workflow bibliographic data to illustrate merging \ 2008-04-05) . * in stata 10 and later: datasignature confirm . . // #2 . // incorrect merging . . use wf-mergebio, clear (Workflow biographical data to illustrate merging \ 2008-04-05) . merge using wf-mergebib . tab1 _merge -> tabulation of _merge _merge | Freq. Percent Cum. ------------+----------------------------------- 3 | 408 100.00 100.00 ------------+----------------------------------- Total | 408 100.00 . drop _merge . . * check new data . 
codebook, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- job 408 80 2.233431 1 4.8 Prestige of first job fem 408 2 .3897059 0 1 Gender: 1=female 0=male phd 408 89 3.200564 1 4.8 PhD prestige ment 408 123 45.47058 0 531.9999 Citations received by mentor id 408 408 204.5 1 408 ID number art 408 14 2.276961 0 18 # of articles published cit 408 87 21.71569 0 203 # of citations received -------------------------------------------------------------------------------- . pwcorr job fem phd ment art cit | job fem phd ment art cit -------------+------------------------------------------------------ job | 1.0000 fem | -0.1076 1.0000 phd | 0.3636 -0.0550 1.0000 ment | 0.2129 -0.0100 0.3253 1.0000 art | -0.3534 0.0713 -0.9115 -0.2829 1.0000 cit | -0.2210 0.0850 -0.6700 -0.2126 0.7340 1.0000 . . // #3 . // correct merging . . use wf-mergebio, clear (Workflow biographical data to illustrate merging \ 2008-04-05) . merge id using wf-mergebib, sort (Workflow bibliographic data to illustrate merging \ 2008-04-05) . tab1 _merge -> tabulation of _merge _merge | Freq. Percent Cum. ------------+----------------------------------- 3 | 408 100.00 100.00 ------------+----------------------------------- Total | 408 100.00 . drop _merge . pwcorr job fem phd ment art cit | job fem phd ment art cit -------------+------------------------------------------------------ job | 1.0000 fem | -0.1076 1.0000 phd | 0.3636 -0.0550 1.0000 ment | 0.2129 -0.0100 0.3253 1.0000 art | 0.2622 -0.1718 0.1534 0.1299 1.0000 cit | 0.3038 -0.0776 0.2284 0.1531 0.7340 1.0000 . . log close log: D:\wf\work\wf6-merge-nomatch.log log type: text closed on: 24 Oct 2008, 09:41:45 -------------------------------------------------------------------------------- . exit end of do-file . . 
log close master log: D:\wf\work\wf6.log log type: text closed on: 24 Oct 2008, 09:41:45 -------------------------------------------------------------------------------- . exit end of do-file . do wf7.do . capture log close master . log using wf7, name(master) replace text (note: file D:\wf\work\wf7.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7.log log type: text opened on: 24 Oct 2008, 09:41:45 . . // program: wf7.do \ for stata 9 . // task: run all do-files in the order they appear . // project: workflow - chapter 7 . // author: scott long \ 2008-10-24 . . * add caption documenting do-file to graphs . do wf7-caption.do . capture log close . log using wf7-caption, replace text (note: file D:\wf\work\wf7-caption.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-caption.log log type: text opened on: 24 Oct 2008, 09:41:45 . . // pgm: wf7-caption.do \ for stata 9 . // task: adding a caption to show graph source . // project: workflow chapter 7 . // author: jsl / 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . set scheme s2manual . . // #1 . // create data to plot . . clear . set obs 51 obs was 0, now 51 . generate articles = _n - 1 . label var articles "Number of publications" . . * art_root# = (articles)^(1/#) . forvalues r = 1(1)5 { 2. * take to the 1/r power . gen art_root`r' = articles^(1/`r') 3. label var art_root`r' "articles^(1/`r')" 4. } . label var art_root2 "2nd root" . label var art_root3 "3rd root" . label var art_root4 "4th root" . label var art_root5 "5th root" . . // #2 . // plot results without caption . . twoway (line art_root2 art_root3 art_root4 art_root5 articles, /// > lwidth(medium)), ytitle(Number of Publications to the k-th Root) /// > yscale(range(0 8.)) legend(pos(11) rows(4) ring(0)) . 
graph export wf7-caption-without.eps, replace (file wf7-caption-without.eps written in EPS format) . . // #3 . // plot results with caption . . twoway (line art_root2 art_root3 art_root4 art_root5 articles, /// > lwidth(medium)), ytitle(Number of Publications to the k-th Root) /// > yscale(range(0 8.)) legend(pos(11) rows(4) ring(0)) /// > caption(wf7-caption.do jsl 2008-10-24, size(vsmall)) . graph export wf7-caption-with.eps, replace (file wf7-caption-with.eps written in EPS format) . . log close log: D:\wf\work\wf7-caption.log log type: text closed on: 24 Oct 2008, 09:41:47 -------------------------------------------------------------------------------- . exit end of do-file . . * using locals with variable names . do wf7-locals.do . capture log close . log using wf7-locals, replace text (note: file D:\wf\work\wf7-locals.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-locals.log log type: text opened on: 24 Oct 2008, 09:41:47 . . // pgm: wf7-locals.do \ for stata 9 . // task: automation - locals . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data and select sample . . use wf-tenure, clear (Workflow data for gender differences in tenure \ 2008-04-02) . * in stata 10 and later: datasignature confirm . keep if sampleis (148 observations deleted) . . // CODEBOOK WITHOUT USING A LOCAL . . // #2 . // desc statistics for men & women combined . . codebook female male tenure year yearsq select articles prestige, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- female 2797 2 .3775474 0 1 Scientist is female? male 2797 2 .6224526 0 1 Is male? tenure 2797 2 .1229889 0 1 Is tenured? 
year 2797 10 3.855917 1 10 Years in rank yearsq 2797 10 20.16911 1 100 Years in rank squared select 2797 8 4.995048 1 7 Baccalaureate selectivity articles 2797 48 7.050411 0 73 Total number of articles prestige 2797 98 2.646591 .65 4.8 Prestige of department -------------------------------------------------------------------------------- . . // #3 . // desc statistics for women . . codebook female male tenure year yearsq select articles prestige /// > if female, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- female 1056 1 1 1 1 Scientist is female? male 1056 1 0 0 0 Is male? tenure 1056 2 .1089015 0 1 Is tenured? year 1056 10 3.974432 1 10 Years in rank yearsq 1056 10 21.45739 1 100 Years in rank squared select 1056 8 5.000852 1 7 Baccalaureate selectivity articles 1056 44 7.414773 0 73 Total number of articles prestige 1056 71 2.658144 .9 4.8 Prestige of department -------------------------------------------------------------------------------- . . // #4 . // desc statistics for men . . codebook female male tenure year yearsq select articles prestige /// > if male, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- female 1741 1 0 0 0 Scientist is female? male 1741 1 1 1 1 Is male? tenure 1741 2 .1315336 0 1 Is tenured? year 1741 10 3.784032 1 10 Years in rank yearsq 1741 10 19.38771 1 100 Years in rank squared select 1741 8 4.991528 1 7 Baccalaureate selectivity articles 1741 37 6.829408 0 49 Total number of articles prestige 1741 81 2.639584 .65 4.64 Prestige of department -------------------------------------------------------------------------------- . . // CODEBOOK USING A LOCAL . . local varset "female male tenure year yearsq select articles prestige" . . // #5 . // desc statistics for men & women combined . . 
codebook `varset', compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- female 2797 2 .3775474 0 1 Scientist is female? male 2797 2 .6224526 0 1 Is male? tenure 2797 2 .1229889 0 1 Is tenured? year 2797 10 3.855917 1 10 Years in rank yearsq 2797 10 20.16911 1 100 Years in rank squared select 2797 8 4.995048 1 7 Baccalaureate selectivity articles 2797 48 7.050411 0 73 Total number of articles prestige 2797 98 2.646591 .65 4.8 Prestige of department -------------------------------------------------------------------------------- . . // #6 . // desc statistics for women . . codebook `varset' if female, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- female 1056 1 1 1 1 Scientist is female? male 1056 1 0 0 0 Is male? tenure 1056 2 .1089015 0 1 Is tenured? year 1056 10 3.974432 1 10 Years in rank yearsq 1056 10 21.45739 1 100 Years in rank squared select 1056 8 5.000852 1 7 Baccalaureate selectivity articles 1056 44 7.414773 0 73 Total number of articles prestige 1056 71 2.658144 .9 4.8 Prestige of department -------------------------------------------------------------------------------- . . // #7 . // desc statistics for men . . codebook `varset' if male, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- female 1741 1 0 0 0 Scientist is female? male 1741 1 1 1 1 Is male? tenure 1741 2 .1315336 0 1 Is tenured? year 1741 10 3.784032 1 10 Years in rank yearsq 1741 10 19.38771 1 100 Years in rank squared select 1741 8 4.991528 1 7 Baccalaureate selectivity articles 1741 37 6.829408 0 49 Total number of articles prestige 1741 81 2.639584 .65 4.64 Prestige of department -------------------------------------------------------------------------------- . . // #8 . // nested models predicting tenure - without using locals . . 
// #8a = baseline gender only model . . logit tenure female, nolog or Logistic regression Number of obs = 2797 LR chi2(1) = 3.17 Prob > chi2 = 0.0752 Log likelihood = -1041.2452 Pseudo R2 = 0.0015 ------------------------------------------------------------------------------ tenure | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- female | .8069089 .0981196 -1.76 0.078 .6357978 1.024071 ------------------------------------------------------------------------------ . . // #8b + time . . logit tenure female year yearsq, nolog or Logistic regression Number of obs = 2797 LR chi2(3) = 348.73 Prob > chi2 = 0.0000 Log likelihood = -868.46481 Pseudo R2 = 0.1672 ------------------------------------------------------------------------------ tenure | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- female | .7231118 .0933734 -2.51 0.012 .5614256 .9313624 year | 6.079157 .9791881 11.21 0.000 4.433409 8.335832 yearsq | .8785368 .0121689 -9.35 0.000 .8550071 .9027141 ------------------------------------------------------------------------------ . . // #8c + department . . logit tenure female year yearsq select prestige, nolog or Logistic regression Number of obs = 2797 LR chi2(5) = 365.86 Prob > chi2 = 0.0000 Log likelihood = -859.89742 Pseudo R2 = 0.1754 ------------------------------------------------------------------------------ tenure | Odds Ratio Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- female | .7208627 .0936383 -2.52 0.012 .5588349 .9298686 year | 6.161345 .9978713 11.23 0.000 4.485571 8.463176 yearsq | .8778896 .0122234 -9.35 0.000 .8542561 .902177 select | 1.151231 .0519671 3.12 0.002 1.053753 1.257726 prestige | .7697678 .0639568 -3.15 0.002 .6540891 .9059049 ------------------------------------------------------------------------------ . . // #8d + productivity . . logit tenure female year yearsq select prestige articles, nolog or Logistic regression Number of obs = 2797 LR chi2(6) = 408.59 Prob > chi2 = 0.0000 Log likelihood = -838.53294 Pseudo R2 = 0.1959 ------------------------------------------------------------------------------ tenure | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- female | .7020495 .09273 -2.68 0.007 .5419223 .9094912 year | 5.602668 .9179641 10.52 0.000 4.063783 7.724302 yearsq | .8821916 .0124554 -8.88 0.000 .8581142 .9069446 select | 1.166965 .0537258 3.35 0.001 1.066275 1.277162 prestige | .6612555 .0583588 -4.69 0.000 .5562204 .7861251 articles | 1.056356 .0090561 6.40 0.000 1.038754 1.074255 ------------------------------------------------------------------------------ . . . // #9 . // nested models predicting tenure - with locals . . // define groups of variables . . local Vtime "year yearsq" // time in rank . local Vdept "select prestige" // characteristics of departments . local Vprod "articles" // research productivity . . // #9a = baseline gender only model . . logit tenure female, nolog or Logistic regression Number of obs = 2797 LR chi2(1) = 3.17 Prob > chi2 = 0.0752 Log likelihood = -1041.2452 Pseudo R2 = 0.0015 ------------------------------------------------------------------------------ tenure | Odds Ratio Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- female | .8069089 .0981196 -1.76 0.078 .6357978 1.024071 ------------------------------------------------------------------------------ . . // #9b + time . . logit tenure female `Vtime', nolog or Logistic regression Number of obs = 2797 LR chi2(3) = 348.73 Prob > chi2 = 0.0000 Log likelihood = -868.46481 Pseudo R2 = 0.1672 ------------------------------------------------------------------------------ tenure | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- female | .7231118 .0933734 -2.51 0.012 .5614256 .9313624 year | 6.079157 .9791881 11.21 0.000 4.433409 8.335832 yearsq | .8785368 .0121689 -9.35 0.000 .8550071 .9027141 ------------------------------------------------------------------------------ . . // #9c + department . . logit tenure female `Vtime' `Vdept', nolog or Logistic regression Number of obs = 2797 LR chi2(5) = 365.86 Prob > chi2 = 0.0000 Log likelihood = -859.89742 Pseudo R2 = 0.1754 ------------------------------------------------------------------------------ tenure | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- female | .7208627 .0936383 -2.52 0.012 .5588349 .9298686 year | 6.161345 .9978713 11.23 0.000 4.485571 8.463176 yearsq | .8778896 .0122234 -9.35 0.000 .8542561 .902177 select | 1.151231 .0519671 3.12 0.002 1.053753 1.257726 prestige | .7697678 .0639568 -3.15 0.002 .6540891 .9059049 ------------------------------------------------------------------------------ . . // #9d + productivity . . logit tenure female `Vtime' `Vdept' `Vprod', nolog or Logistic regression Number of obs = 2797 LR chi2(6) = 408.59 Prob > chi2 = 0.0000 Log likelihood = -838.53294 Pseudo R2 = 0.1959 ------------------------------------------------------------------------------ tenure | Odds Ratio Std. Err. 
z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- female | .7020495 .09273 -2.68 0.007 .5419223 .9094912 year | 5.602668 .9179641 10.52 0.000 4.063783 7.724302 yearsq | .8821916 .0124554 -8.88 0.000 .8581142 .9069446 select | 1.166965 .0537258 3.35 0.001 1.066275 1.277162 prestige | .6612555 .0583588 -4.69 0.000 .5562204 .7861251 articles | 1.056356 .0090561 6.40 0.000 1.038754 1.074255 ------------------------------------------------------------------------------ . . log close log: D:\wf\work\wf7-locals.log log type: text closed on: 24 Oct 2008, 09:41:48 -------------------------------------------------------------------------------- . exit end of do-file . . * loops to run analysis commands . do wf7-loops-ttest.do . capture log close . log using wf7-loops-ttest, replace text (note: file D:\wf\work\wf7-loops-ttest.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-loops-ttest.log log type: text opened on: 24 Oct 2008, 09:41:48 . . // pgm: wf7-loops-ttest.do \ for stata 9 . // task: using loops for t-tests . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data and select sample . . use wf-tenure, clear (Workflow data for gender differences in tenure \ 2008-04-02) . * in stata 10 and later: datasignature confirm . tabulate sampleis Sample for | tenure | analysis | Freq. Percent Cum. ------------+----------------------------------- 0_Not | 148 5.03 5.03 1_InSample | 2,797 94.97 100.00 ------------+----------------------------------- Total | 2,945 100.00 . keep if sampleis (148 observations deleted) . . // #2 . // ttest gender differences without a loop . . 
ttest tenure, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 .1315336 .0081025 .3380801 .1156419 .1474253 1_Female | 1056 .1089015 .0095908 .3116632 .0900824 .1277207 ---------+-------------------------------------------------------------------- combined | 2797 .1229889 .0062111 .3284832 .1108102 .1351677 ---------+-------------------------------------------------------------------- diff | .0226321 .0128075 -.002481 .0477451 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = 1.7671 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.9613 Pr(|T| > |t|) = 0.0773 Pr(T > t) = 0.0387 . ttest year, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 3.784032 .0539732 2.252048 3.678173 3.889891 1_Female | 1056 3.974432 .0732539 2.380471 3.830692 4.118172 ---------+-------------------------------------------------------------------- combined | 2797 3.855917 .0435423 2.302805 3.770539 3.941295 ---------+-------------------------------------------------------------------- diff | -.1903997 .0897636 -.3664094 -.01439 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -2.1211 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.0170 Pr(|T| > |t|) = 0.0340 Pr(T > t) = 0.9830 . 
ttest select, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 4.991528 .0327213 1.365305 4.927351 5.055705 1_Female | 1056 5.000852 .0453902 1.475006 4.911787 5.089918 ---------+-------------------------------------------------------------------- combined | 2797 4.995048 .026613 1.407473 4.942865 5.047231 ---------+-------------------------------------------------------------------- diff | -.0093244 .0549074 -.1169875 .0983386 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -0.1698 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.4326 Pr(|T| > |t|) = 0.8652 Pr(T > t) = 0.5674 . ttest articles, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 6.829408 .1435568 5.98995 6.547846 7.11097 1_Female | 1056 7.414773 .2286447 7.430074 6.966123 7.863423 ---------+-------------------------------------------------------------------- combined | 2797 7.050411 .1243353 6.575682 6.806613 7.294209 ---------+-------------------------------------------------------------------- diff | -.5853643 .2562881 -1.087897 -.0828313 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -2.2840 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.0112 Pr(|T| > |t|) = 0.0224 Pr(T > t) = 0.9888 . 
ttest prestige, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 2.639584 .0187939 .7841798 2.602723 2.676445 1_Female | 1056 2.658144 .0235465 .7651707 2.611941 2.704347 ---------+-------------------------------------------------------------------- combined | 2797 2.646591 .0146913 .7769724 2.617784 2.675398 ---------+-------------------------------------------------------------------- diff | -.0185604 .0303088 -.0779903 .0408696 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -0.6124 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.2702 Pr(|T| > |t|) = 0.5403 Pr(T > t) = 0.7298 . . // #3 . // ttest gender differences using a loop with no header . . local varlist "tenure year select articles prestige" . foreach var in `varlist' { 2. ttest `var', by(female) 3. } Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. 
Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 .1315336 .0081025 .3380801 .1156419 .1474253 1_Female | 1056 .1089015 .0095908 .3116632 .0900824 .1277207 ---------+-------------------------------------------------------------------- combined | 2797 .1229889 .0062111 .3284832 .1108102 .1351677 ---------+-------------------------------------------------------------------- diff | .0226321 .0128075 -.002481 .0477451 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = 1.7671 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.9613 Pr(|T| > |t|) = 0.0773 Pr(T > t) = 0.0387 Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 3.784032 .0539732 2.252048 3.678173 3.889891 1_Female | 1056 3.974432 .0732539 2.380471 3.830692 4.118172 ---------+-------------------------------------------------------------------- combined | 2797 3.855917 .0435423 2.302805 3.770539 3.941295 ---------+-------------------------------------------------------------------- diff | -.1903997 .0897636 -.3664094 -.01439 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -2.1211 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.0170 Pr(|T| > |t|) = 0.0340 Pr(T > t) = 0.9830 Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. 
Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 4.991528 .0327213 1.365305 4.927351 5.055705 1_Female | 1056 5.000852 .0453902 1.475006 4.911787 5.089918 ---------+-------------------------------------------------------------------- combined | 2797 4.995048 .026613 1.407473 4.942865 5.047231 ---------+-------------------------------------------------------------------- diff | -.0093244 .0549074 -.1169875 .0983386 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -0.1698 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.4326 Pr(|T| > |t|) = 0.8652 Pr(T > t) = 0.5674 Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 6.829408 .1435568 5.98995 6.547846 7.11097 1_Female | 1056 7.414773 .2286447 7.430074 6.966123 7.863423 ---------+-------------------------------------------------------------------- combined | 2797 7.050411 .1243353 6.575682 6.806613 7.294209 ---------+-------------------------------------------------------------------- diff | -.5853643 .2562881 -1.087897 -.0828313 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -2.2840 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.0112 Pr(|T| > |t|) = 0.0224 Pr(T > t) = 0.9888 Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. 
Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 2.639584 .0187939 .7841798 2.602723 2.676445 1_Female | 1056 2.658144 .0235465 .7651707 2.611941 2.704347 ---------+-------------------------------------------------------------------- combined | 2797 2.646591 .0146913 .7769724 2.617784 2.675398 ---------+-------------------------------------------------------------------- diff | -.0185604 .0303088 -.0779903 .0408696 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -0.6124 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.2702 Pr(|T| > |t|) = 0.5403 Pr(T > t) = 0.7298 . . // #4 . // ttest gender differences using a loop with a header . . local varlist "tenure year select articles prestige" . foreach var in `varlist' { 2. * echo command . di _new ". ttest `var', by(female)" 3. ttest `var', by(female) 4. } . ttest tenure, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 .1315336 .0081025 .3380801 .1156419 .1474253 1_Female | 1056 .1089015 .0095908 .3116632 .0900824 .1277207 ---------+-------------------------------------------------------------------- combined | 2797 .1229889 .0062111 .3284832 .1108102 .1351677 ---------+-------------------------------------------------------------------- diff | .0226321 .0128075 -.002481 .0477451 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = 1.7671 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.9613 Pr(|T| > |t|) = 0.0773 Pr(T > t) = 0.0387 . 
ttest year, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 3.784032 .0539732 2.252048 3.678173 3.889891 1_Female | 1056 3.974432 .0732539 2.380471 3.830692 4.118172 ---------+-------------------------------------------------------------------- combined | 2797 3.855917 .0435423 2.302805 3.770539 3.941295 ---------+-------------------------------------------------------------------- diff | -.1903997 .0897636 -.3664094 -.01439 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -2.1211 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.0170 Pr(|T| > |t|) = 0.0340 Pr(T > t) = 0.9830 . ttest select, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 4.991528 .0327213 1.365305 4.927351 5.055705 1_Female | 1056 5.000852 .0453902 1.475006 4.911787 5.089918 ---------+-------------------------------------------------------------------- combined | 2797 4.995048 .026613 1.407473 4.942865 5.047231 ---------+-------------------------------------------------------------------- diff | -.0093244 .0549074 -.1169875 .0983386 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -0.1698 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.4326 Pr(|T| > |t|) = 0.8652 Pr(T > t) = 0.5674 . 
ttest articles, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 6.829408 .1435568 5.98995 6.547846 7.11097 1_Female | 1056 7.414773 .2286447 7.430074 6.966123 7.863423 ---------+-------------------------------------------------------------------- combined | 2797 7.050411 .1243353 6.575682 6.806613 7.294209 ---------+-------------------------------------------------------------------- diff | -.5853643 .2562881 -1.087897 -.0828313 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -2.2840 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.0112 Pr(|T| > |t|) = 0.0224 Pr(T > t) = 0.9888 . ttest prestige, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 2.639584 .0187939 .7841798 2.602723 2.676445 1_Female | 1056 2.658144 .0235465 .7651707 2.611941 2.704347 ---------+-------------------------------------------------------------------- combined | 2797 2.646591 .0146913 .7769724 2.617784 2.675398 ---------+-------------------------------------------------------------------- diff | -.0185604 .0303088 -.0779903 .0408696 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -0.6124 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.2702 Pr(|T| > |t|) = 0.5403 Pr(T > t) = 0.7298 . . 
log close log: D:\wf\work\wf7-loops-ttest.log log type: text closed on: 24 Oct 2008, 09:41:49 -------------------------------------------------------------------------------- . exit end of do-file . do wf7-loops-arttran.do . capture log close . log using wf7-loops-arttran, replace text (note: file D:\wf\work\wf7-loops-arttran.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-loops-arttran.log log type: text opened on: 24 Oct 2008, 09:41:49 . . // pgm: wf7-loops-arttran.do \ for stata 9 . // task: using loops for logits with root transformations of articles . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data and select sample . . use wf-tenure, clear (Workflow data for gender differences in tenure \ 2008-04-02) . * in stata 10 and later: datasignature confirm . keep if sampleis (148 observations deleted) . . // #2 . // create art_root# = (articles)^(1/#) . . * local to hold list of variables with transformed articles . local artvars "" . . * loop through roots 1 through 9 . forvalues root = 1(1)9 { 2. * take to the 1/root power . gen art_root`root' = articles^(1/`root') 3. label var art_root`root' "articles^(1/`root')" 4. * add new variable to the list . local artvars "`artvars' art_root`root'" 5. } . . // #4 . // loop through models . . foreach avar in `artvars' { 2. logit tenure `avar' female year yearsq select prestige, nolog 3. } Logistic regression Number of obs = 2797 LR chi2(6) = 408.59 Prob > chi2 = 0.0000 Log likelihood = -838.53294 Pseudo R2 = 0.1959 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root1 | .0548251 .008573 6.40 0.000 .0380224 .0716278 female | -.3537514 .1320848 -2.68 0.007 -.6126327 -.09487 year | 1.723243 .1638441 10.52 0.000 1.402114 2.044371 yearsq | -.125346 .0141187 -8.88 0.000 -.1530181 -.0976739 select | .1544061 .0460389 3.35 0.001 .0641715 .2446407 prestige | -.413615 .0882545 -4.69 0.000 -.5865907 -.2406394 _cons | -6.812655 .5290562 -12.88 0.000 -7.849586 -5.775724 ------------------------------------------------------------------------------ Logistic regression Number of obs = 2797 LR chi2(6) = 425.01 Prob > chi2 = 0.0000 Log likelihood = -830.32437 Pseudo R2 = 0.2038 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- art_root2 | .4332367 .0576679 7.51 0.000 .3202097 .5462636 female | -.340382 .1324709 -2.57 0.010 -.6000202 -.0807438 year | 1.674869 .1646972 10.17 0.000 1.352068 1.99767 yearsq | -.1219834 .0141757 -8.61 0.000 -.1497674 -.0941995 select | .1545827 .0462078 3.35 0.001 .064017 .2451483 prestige | -.4437035 .088657 -5.00 0.000 -.6174681 -.269939 _cons | -7.307125 .5357716 -13.64 0.000 -8.357218 -6.257032 ------------------------------------------------------------------------------ Logistic regression Number of obs = 2797 LR chi2(6) = 424.42 Prob > chi2 = 0.0000 Log likelihood = -830.61834 Pseudo R2 = 0.2035 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root3 | .8999652 .1232629 7.30 0.000 .6583744 1.141556 female | -.3320245 .1322814 -2.51 0.012 -.5912913 -.0727578 year | 1.666784 .1647204 10.12 0.000 1.343938 1.989629 yearsq | -.12122 .0141645 -8.56 0.000 -.1489818 -.0934581 select | .1520817 .0460792 3.30 0.001 .061768 .2423953 prestige | -.4357212 .0880958 -4.95 0.000 -.6083859 -.2630566 _cons | -7.837491 .5514876 -14.21 0.000 -8.918386 -6.756595 ------------------------------------------------------------------------------ Logistic regression Number of obs = 2797 LR chi2(6) = 420.61 Prob > chi2 = 0.0000 Log likelihood = -832.52554 Pseudo R2 = 0.2017 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- art_root4 | 1.330429 .193361 6.88 0.000 .9514482 1.709409 female | -.3270201 .1320287 -2.48 0.013 -.5857916 -.0682486 year | 1.668452 .16459 10.14 0.000 1.345861 1.991042 yearsq | -.1211684 .0141445 -8.57 0.000 -.148891 -.0934458 select | .1499813 .045934 3.27 0.001 .0599523 .2400103 prestige | -.4233462 .087572 -4.83 0.000 -.5949842 -.2517083 _cons | -8.290543 .5762453 -14.39 0.000 -9.419963 -7.161123 ------------------------------------------------------------------------------ Logistic regression Number of obs = 2797 LR chi2(6) = 415.99 Prob > chi2 = 0.0000 Log likelihood = -834.83473 Pseudo R2 = 0.1995 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root5 | 1.695841 .2643407 6.42 0.000 1.177743 2.213939 female | -.323661 .13178 -2.46 0.014 -.5819451 -.0653769 year | 1.673509 .1644332 10.18 0.000 1.351226 1.995792 yearsq | -.1213811 .0141244 -8.59 0.000 -.1490643 -.0936978 select | .1482849 .0458009 3.24 0.001 .0585167 .2380531 prestige | -.4106689 .0871326 -4.71 0.000 -.5814458 -.2398921 _cons | -8.659509 .6082361 -14.24 0.000 -9.85163 -7.467388 ------------------------------------------------------------------------------ Logistic regression Number of obs = 2797 LR chi2(6) = 411.29 Prob > chi2 = 0.0000 Log likelihood = -837.18504 Pseudo R2 = 0.1972 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- art_root6 | 1.990187 .3344517 5.95 0.000 1.334674 2.6457 female | -.3211841 .1315504 -2.44 0.015 -.5790181 -.0633501 year | 1.680002 .1642826 10.23 0.000 1.358014 2.00199 yearsq | -.121713 .0141061 -8.63 0.000 -.1493605 -.0940656 select | .1468831 .0456826 3.22 0.001 .0573468 .2364194 prestige | -.3984683 .0867579 -4.59 0.000 -.5685106 -.228426 _cons | -8.948241 .6454061 -13.86 0.000 -10.21321 -7.683268 ------------------------------------------------------------------------------ Logistic regression Number of obs = 2797 LR chi2(6) = 406.78 Prob > chi2 = 0.0000 Log likelihood = -839.43831 Pseudo R2 = 0.1950 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root7 | 2.213966 .4023596 5.50 0.000 1.425356 3.002577 female | -.3192306 .1313433 -2.43 0.015 -.5766587 -.0618026 year | 1.687177 .1641465 10.28 0.000 1.365456 2.008898 yearsq | -.1221061 .0140899 -8.67 0.000 -.1497218 -.0944903 select | .1456999 .045578 3.20 0.001 .0563688 .2350311 prestige | -.3868973 .0864281 -4.48 0.000 -.5562932 -.2175015 _cons | -9.161927 .6856449 -13.36 0.000 -10.50577 -7.818087 ------------------------------------------------------------------------------ Logistic regression Number of obs = 2797 LR chi2(6) = 402.59 Prob > chi2 = 0.0000 Log likelihood = -841.53528 Pseudo R2 = 0.1930 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- art_root8 | 2.370269 .466713 5.08 0.000 1.455528 3.285009 female | -.3176208 .1311588 -2.42 0.015 -.5746874 -.0605543 year | 1.694679 .1640258 10.33 0.000 1.373195 2.016164 yearsq | -.1225323 .0140758 -8.71 0.000 -.1501203 -.0949442 select | .1446879 .0454856 3.18 0.001 .0555377 .233838 prestige | -.3759743 .0861289 -4.37 0.000 -.5447837 -.2071648 _cons | -9.30613 .7268057 -12.80 0.000 -10.73064 -7.881617 ------------------------------------------------------------------------------ Logistic regression Number of obs = 2797 LR chi2(6) = 398.76 Prob > chi2 = 0.0000 Log likelihood = -843.45036 Pseudo R2 = 0.1912 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root9 | 2.463891 .5259642 4.68 0.000 1.43302 3.494762 female | -.3162621 .1309963 -2.41 0.016 -.5730101 -.0595141 year | 1.702306 .163918 10.39 0.000 1.381032 2.023579 yearsq | -.1229755 .0140635 -8.74 0.000 -.1505394 -.0954116 select | .1438176 .0454046 3.17 0.002 .0548263 .232809 prestige | -.3657016 .08585 -4.26 0.000 -.5339646 -.1974386 _cons | -9.38709 .766672 -12.24 0.000 -10.88974 -7.884441 ------------------------------------------------------------------------------ . . // #5 . // loop through models with a description . . foreach avar in `artvars' { 2. display _new "== logit with `avar'" 3. logit tenure `avar' female year yearsq select prestige, nolog 4. } == logit with art_root1 Logistic regression Number of obs = 2797 LR chi2(6) = 408.59 Prob > chi2 = 0.0000 Log likelihood = -838.53294 Pseudo R2 = 0.1959 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- art_root1 | .0548251 .008573 6.40 0.000 .0380224 .0716278 female | -.3537514 .1320848 -2.68 0.007 -.6126327 -.09487 year | 1.723243 .1638441 10.52 0.000 1.402114 2.044371 yearsq | -.125346 .0141187 -8.88 0.000 -.1530181 -.0976739 select | .1544061 .0460389 3.35 0.001 .0641715 .2446407 prestige | -.413615 .0882545 -4.69 0.000 -.5865907 -.2406394 _cons | -6.812655 .5290562 -12.88 0.000 -7.849586 -5.775724 ------------------------------------------------------------------------------ == logit with art_root2 Logistic regression Number of obs = 2797 LR chi2(6) = 425.01 Prob > chi2 = 0.0000 Log likelihood = -830.32437 Pseudo R2 = 0.2038 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root2 | .4332367 .0576679 7.51 0.000 .3202097 .5462636 female | -.340382 .1324709 -2.57 0.010 -.6000202 -.0807438 year | 1.674869 .1646972 10.17 0.000 1.352068 1.99767 yearsq | -.1219834 .0141757 -8.61 0.000 -.1497674 -.0941995 select | .1545827 .0462078 3.35 0.001 .064017 .2451483 prestige | -.4437035 .088657 -5.00 0.000 -.6174681 -.269939 _cons | -7.307125 .5357716 -13.64 0.000 -8.357218 -6.257032 ------------------------------------------------------------------------------ == logit with art_root3 Logistic regression Number of obs = 2797 LR chi2(6) = 424.42 Prob > chi2 = 0.0000 Log likelihood = -830.61834 Pseudo R2 = 0.2035 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- art_root3 | .8999652 .1232629 7.30 0.000 .6583744 1.141556 female | -.3320245 .1322814 -2.51 0.012 -.5912913 -.0727578 year | 1.666784 .1647204 10.12 0.000 1.343938 1.989629 yearsq | -.12122 .0141645 -8.56 0.000 -.1489818 -.0934581 select | .1520817 .0460792 3.30 0.001 .061768 .2423953 prestige | -.4357212 .0880958 -4.95 0.000 -.6083859 -.2630566 _cons | -7.837491 .5514876 -14.21 0.000 -8.918386 -6.756595 ------------------------------------------------------------------------------ == logit with art_root4 Logistic regression Number of obs = 2797 LR chi2(6) = 420.61 Prob > chi2 = 0.0000 Log likelihood = -832.52554 Pseudo R2 = 0.2017 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root4 | 1.330429 .193361 6.88 0.000 .9514482 1.709409 female | -.3270201 .1320287 -2.48 0.013 -.5857916 -.0682486 year | 1.668452 .16459 10.14 0.000 1.345861 1.991042 yearsq | -.1211684 .0141445 -8.57 0.000 -.148891 -.0934458 select | .1499813 .045934 3.27 0.001 .0599523 .2400103 prestige | -.4233462 .087572 -4.83 0.000 -.5949842 -.2517083 _cons | -8.290543 .5762453 -14.39 0.000 -9.419963 -7.161123 ------------------------------------------------------------------------------ == logit with art_root5 Logistic regression Number of obs = 2797 LR chi2(6) = 415.99 Prob > chi2 = 0.0000 Log likelihood = -834.83473 Pseudo R2 = 0.1995 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- art_root5 | 1.695841 .2643407 6.42 0.000 1.177743 2.213939 female | -.323661 .13178 -2.46 0.014 -.5819451 -.0653769 year | 1.673509 .1644332 10.18 0.000 1.351226 1.995792 yearsq | -.1213811 .0141244 -8.59 0.000 -.1490643 -.0936978 select | .1482849 .0458009 3.24 0.001 .0585167 .2380531 prestige | -.4106689 .0871326 -4.71 0.000 -.5814458 -.2398921 _cons | -8.659509 .6082361 -14.24 0.000 -9.85163 -7.467388 ------------------------------------------------------------------------------ == logit with art_root6 Logistic regression Number of obs = 2797 LR chi2(6) = 411.29 Prob > chi2 = 0.0000 Log likelihood = -837.18504 Pseudo R2 = 0.1972 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root6 | 1.990187 .3344517 5.95 0.000 1.334674 2.6457 female | -.3211841 .1315504 -2.44 0.015 -.5790181 -.0633501 year | 1.680002 .1642826 10.23 0.000 1.358014 2.00199 yearsq | -.121713 .0141061 -8.63 0.000 -.1493605 -.0940656 select | .1468831 .0456826 3.22 0.001 .0573468 .2364194 prestige | -.3984683 .0867579 -4.59 0.000 -.5685106 -.228426 _cons | -8.948241 .6454061 -13.86 0.000 -10.21321 -7.683268 ------------------------------------------------------------------------------ == logit with art_root7 Logistic regression Number of obs = 2797 LR chi2(6) = 406.78 Prob > chi2 = 0.0000 Log likelihood = -839.43831 Pseudo R2 = 0.1950 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- art_root7 | 2.213966 .4023596 5.50 0.000 1.425356 3.002577 female | -.3192306 .1313433 -2.43 0.015 -.5766587 -.0618026 year | 1.687177 .1641465 10.28 0.000 1.365456 2.008898 yearsq | -.1221061 .0140899 -8.67 0.000 -.1497218 -.0944903 select | .1456999 .045578 3.20 0.001 .0563688 .2350311 prestige | -.3868973 .0864281 -4.48 0.000 -.5562932 -.2175015 _cons | -9.161927 .6856449 -13.36 0.000 -10.50577 -7.818087 ------------------------------------------------------------------------------ == logit with art_root8 Logistic regression Number of obs = 2797 LR chi2(6) = 402.59 Prob > chi2 = 0.0000 Log likelihood = -841.53528 Pseudo R2 = 0.1930 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root8 | 2.370269 .466713 5.08 0.000 1.455528 3.285009 female | -.3176208 .1311588 -2.42 0.015 -.5746874 -.0605543 year | 1.694679 .1640258 10.33 0.000 1.373195 2.016164 yearsq | -.1225323 .0140758 -8.71 0.000 -.1501203 -.0949442 select | .1446879 .0454856 3.18 0.001 .0555377 .233838 prestige | -.3759743 .0861289 -4.37 0.000 -.5447837 -.2071648 _cons | -9.30613 .7268057 -12.80 0.000 -10.73064 -7.881617 ------------------------------------------------------------------------------ == logit with art_root9 Logistic regression Number of obs = 2797 LR chi2(6) = 398.76 Prob > chi2 = 0.0000 Log likelihood = -843.45036 Pseudo R2 = 0.1912 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- art_root9 | 2.463891 .5259642 4.68 0.000 1.43302 3.494762 female | -.3162621 .1309963 -2.41 0.016 -.5730101 -.0595141 year | 1.702306 .163918 10.39 0.000 1.381032 2.023579 yearsq | -.1229755 .0140635 -8.74 0.000 -.1505394 -.0954116 select | .1438176 .0454046 3.17 0.002 .0548263 .232809 prestige | -.3657016 .08585 -4.26 0.000 -.5339646 -.1974386 _cons | -9.38709 .766672 -12.24 0.000 -10.88974 -7.884441 ------------------------------------------------------------------------------ . . log close log: D:\wf\work\wf7-loops-arttran.log log type: text closed on: 24 Oct 2008, 09:41:50 -------------------------------------------------------------------------------- . exit end of do-file . . * matrices to hold results . do wf7-matrix-ttest.do . capture log close . log using wf7-matrix-ttest, replace text (note: file D:\wf\work\wf7-matrix-ttest.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-matrix-ttest.log log type: text opened on: 24 Oct 2008, 09:41:50 . . 
// pgm: wf7-matrix-ttest.do \ for stata 9 . // task: use matrix to collect results from ttest . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data and select sample . . use wf-tenure, clear (Workflow data for gender differences in tenure \ 2008-04-02) . * in stata 10 and later: datasignature confirm . keep if sampleis (148 observations deleted) . . // #2 . // ttest of gender differences w/o matrices . . local varlist "tenure year select articles prestige" . foreach var in `varlist' { 2. display _new ". ttest `var', by(female)" 3. ttest `var', by(female) 4. } . ttest tenure, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 .1315336 .0081025 .3380801 .1156419 .1474253 1_Female | 1056 .1089015 .0095908 .3116632 .0900824 .1277207 ---------+-------------------------------------------------------------------- combined | 2797 .1229889 .0062111 .3284832 .1108102 .1351677 ---------+-------------------------------------------------------------------- diff | .0226321 .0128075 -.002481 .0477451 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = 1.7671 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.9613 Pr(|T| > |t|) = 0.0773 Pr(T > t) = 0.0387 . ttest year, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. 
Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 3.784032 .0539732 2.252048 3.678173 3.889891 1_Female | 1056 3.974432 .0732539 2.380471 3.830692 4.118172 ---------+-------------------------------------------------------------------- combined | 2797 3.855917 .0435423 2.302805 3.770539 3.941295 ---------+-------------------------------------------------------------------- diff | -.1903997 .0897636 -.3664094 -.01439 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -2.1211 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.0170 Pr(|T| > |t|) = 0.0340 Pr(T > t) = 0.9830 . ttest select, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 4.991528 .0327213 1.365305 4.927351 5.055705 1_Female | 1056 5.000852 .0453902 1.475006 4.911787 5.089918 ---------+-------------------------------------------------------------------- combined | 2797 4.995048 .026613 1.407473 4.942865 5.047231 ---------+-------------------------------------------------------------------- diff | -.0093244 .0549074 -.1169875 .0983386 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -0.1698 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.4326 Pr(|T| > |t|) = 0.8652 Pr(T > t) = 0.5674 . ttest articles, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. 
Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 6.829408 .1435568 5.98995 6.547846 7.11097 1_Female | 1056 7.414773 .2286447 7.430074 6.966123 7.863423 ---------+-------------------------------------------------------------------- combined | 2797 7.050411 .1243353 6.575682 6.806613 7.294209 ---------+-------------------------------------------------------------------- diff | -.5853643 .2562881 -1.087897 -.0828313 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -2.2840 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.0112 Pr(|T| > |t|) = 0.0224 Pr(T > t) = 0.9888 . ttest prestige, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 2.639584 .0187939 .7841798 2.602723 2.676445 1_Female | 1056 2.658144 .0235465 .7651707 2.611941 2.704347 ---------+-------------------------------------------------------------------- combined | 2797 2.646591 .0146913 .7769724 2.617784 2.675398 ---------+-------------------------------------------------------------------- diff | -.0185604 .0303088 -.0779903 .0408696 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = -0.6124 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.2702 Pr(|T| > |t|) = 0.5403 Pr(T > t) = 0.7298 . . // #3 . // create the matrix (see below for a fancier method) . . matrix stats = J(5,6,-99) . matrix list stats stats[5,6] c1 c2 c3 c4 c5 c6 r1 -99 -99 -99 -99 -99 -99 r2 -99 -99 -99 -99 -99 -99 r3 -99 -99 -99 -99 -99 -99 r4 -99 -99 -99 -99 -99 -99 r5 -99 -99 -99 -99 -99 -99 . * add row and column names . 
matrix colnames stats = FemMn FemSD MalMn MalSD t_test t_prob . matrix rownames stats = `varlist' . matrix list stats stats[5,6] FemMn FemSD MalMn MalSD t_test t_prob tenure -99 -99 -99 -99 -99 -99 year -99 -99 -99 -99 -99 -99 select -99 -99 -99 -99 -99 -99 articles -99 -99 -99 -99 -99 -99 prestige -99 -99 -99 -99 -99 -99 . . // #4 . // examine what ttest returns . . ttest tenure, by(female) Two-sample t test with equal variances ------------------------------------------------------------------------------ Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval] ---------+-------------------------------------------------------------------- 0_Male | 1741 .1315336 .0081025 .3380801 .1156419 .1474253 1_Female | 1056 .1089015 .0095908 .3116632 .0900824 .1277207 ---------+-------------------------------------------------------------------- combined | 2797 .1229889 .0062111 .3284832 .1108102 .1351677 ---------+-------------------------------------------------------------------- diff | .0226321 .0128075 -.002481 .0477451 ------------------------------------------------------------------------------ diff = mean(0_Male) - mean(1_Female) t = 1.7671 Ho: diff = 0 degrees of freedom = 2795 Ha: diff < 0 Ha: diff != 0 Ha: diff > 0 Pr(T < t) = 0.9613 Pr(|T| > |t|) = 0.0773 Pr(T > t) = 0.0387 . return list scalars: r(sd) = .3284832119751412 r(sd_2) = .3116632125613366 r(sd_1) = .3380801147013905 r(se) = .0128074679461748 r(p_u) = .0386602339087829 r(p_l) = .9613397660912171 r(p) = .0773204678175659 r(t) = 1.767100751070975 r(df_t) = 2795 r(mu_2) = .1089015151515152 r(N_2) = 1056 r(mu_1) = .1315336013785181 r(N_1) = 1741 . . // #5 . // collect t-test results in matrix . . local irow = 0 . foreach var of varlist `varlist' { 2. local ++irow 3. qui ttest `var', by(female) 4. matrix stats[`irow',1] = r(mu_2) // female mean 5. matrix stats[`irow',2] = r(sd_2) // female sd 6. matrix stats[`irow',3] = r(mu_1) // male mean 7. matrix stats[`irow',4] = r(sd_1) // male sd 8. 
matrix stats[`irow',5] = r(t) // t-value 9. matrix stats[`irow',6] = r(p) // p-value 10. } . . // #6 . // ways to list results . . * the easiest way . matrix list stats stats[5,6] FemMn FemSD MalMn MalSD t_test tenure .10890152 .31166321 .1315336 .33808011 1.7671008 year 3.9744318 2.3804714 3.7840322 2.2520484 -2.1211225 select 5.0008522 1.4750064 4.9915278 1.3653054 -.16982095 articles 7.4147727 7.430074 6.8294084 5.9899501 -2.2840091 prestige 2.6581439 .76517075 2.6395836 .7841798 -.61237549 t_prob tenure .07732047 year .0339993 select .86516324 articles .02244567 prestige .54033917 . . * creating a header . local n_men = r(N_1) . local n_women = r(N_2) . local header "t-tests: mean_women (N=`n_women') = mean_men (N=`n_men')" . . * alternative formats . matrix list stats, format(%9.3f) stats[5,6] FemMn FemSD MalMn MalSD t_test t_prob tenure 0.109 0.312 0.132 0.338 1.767 0.077 year 3.974 2.380 3.784 2.252 -2.121 0.034 select 5.001 1.475 4.992 1.365 -0.170 0.865 articles 7.415 7.430 6.829 5.990 -2.284 0.022 prestige 2.658 0.765 2.640 0.784 -0.612 0.540 . matrix list stats, format(%9.3f) title(`header') stats[5,6]: t-tests: mean_women (N=1056) = mean_men (N=1741) FemMn FemSD MalMn MalSD t_test t_prob tenure 0.109 0.312 0.132 0.338 1.767 0.077 year 3.974 2.380 3.784 2.252 -2.121 0.034 select 5.001 1.475 4.992 1.365 -0.170 0.865 articles 7.415 7.430 6.829 5.990 -2.284 0.022 prestige 2.658 0.765 2.640 0.784 -0.612 0.540 . matrix list stats, format(%9.2f) title(`header') stats[5,6]: t-tests: mean_women (N=1056) = mean_men (N=1741) FemMn FemSD MalMn MalSD t_test t_prob tenure 0.11 0.31 0.13 0.34 1.77 0.08 year 3.97 2.38 3.78 2.25 -2.12 0.03 select 5.00 1.48 4.99 1.37 -0.17 0.87 articles 7.41 7.43 6.83 5.99 -2.28 0.02 prestige 2.66 0.77 2.64 0.78 -0.61 0.54 . . // #7 . // a fancier way to create a matrix . . local nvars : word count `varlist' . matrix stats2 = J(`nvars',6,-99) . matrix colnames stats2 = FemMn FemSD MalMn MalSD t_test t_prob . 
matrix rownames stats2 = `varlist' . matrix list stats2 stats2[5,6] FemMn FemSD MalMn MalSD t_test t_prob tenure -99 -99 -99 -99 -99 -99 year -99 -99 -99 -99 -99 -99 select -99 -99 -99 -99 -99 -99 articles -99 -99 -99 -99 -99 -99 prestige -99 -99 -99 -99 -99 -99 . . log close log: D:\wf\work\wf7-matrix-ttest.log log type: text closed on: 24 Oct 2008, 09:41:50 -------------------------------------------------------------------------------- . exit end of do-file . do wf7-matrix-nested.do . capture log close . log using wf7-matrix-nested, replace text (note: file D:\wf\work\wf7-matrix-nested.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-matrix-nested.log log type: text opened on: 24 Oct 2008, 09:41:50 . . // pgm: wf7-matrix-nested.do \ for stata 9 . // task: use matrix to collect results from nested models . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data and select sample . . use wf-tenure, clear (Workflow data for gender differences in tenure \ 2008-04-02) . * in stata 10 and later: datasignature confirm . keep if sampleis (148 observations deleted) . . // #2 . // define groups of variables . . local Vtime "year yearsq" // time in rank . local Vdept "select prestige" // characteristics of departments . local Vprod "articles" // research productivity . . // #3 . // set up matrix for results . . local modelnm "base plustime plusdept plusprod" . local statsnm "ORfemale zfemale BIC" . matrix stats = J(4,3,-99) . matrix rownames stats = `modelnm' . matrix colnames stats = `statsnm' . matrix list stats stats[4,3] ORfemale zfemale BIC base -99 -99 -99 plustime -99 -99 -99 plusdept -99 -99 -99 plusprod -99 -99 -99 . . // #4 . // nested models predicting tenure . . // #4a - baseline gender only model . . 
logit tenure female, or Iteration 0: log likelihood = -1042.8284 Iteration 1: log likelihood = -1041.2482 Iteration 2: log likelihood = -1041.2452 Logistic regression Number of obs = 2797 LR chi2(1) = 3.17 Prob > chi2 = 0.0752 Log likelihood = -1041.2452 Pseudo R2 = 0.0015 ------------------------------------------------------------------------------ tenure | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- female | .8069089 .0981196 -1.76 0.078 .6357978 1.024071 ------------------------------------------------------------------------------ . matrix b = e(b) // get betas . matrix list b b[1,2] female _cons y1 -.21454446 -1.8874666 . matrix v = e(V) // get covariance of betas . matrix list v symmetric v[2,2] female _cons female .01478639 _cons -.00502818 .00502818 . * put results in matrix . matrix stats[1,1] = exp(b[1,1]) // compute OR for female . matrix stats[1,2] = b[1,1]/sqrt(v[1,1]) // compute z . estat ic // get BIC ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -1041.245 2 2086.49 2098.363 ------------------------------------------------------------------------------ . matrix temp = r(S) . matrix stats[1,3] = temp[1,6] . . // #4b + time . . logit tenure female `Vtime', or Iteration 0: log likelihood = -1042.8284 Iteration 1: log likelihood = -904.40022 Iteration 2: log likelihood = -872.58561 Iteration 3: log likelihood = -868.59384 Iteration 4: log likelihood = -868.46499 Iteration 5: log likelihood = -868.46481 Logistic regression Number of obs = 2797 LR chi2(3) = 348.73 Prob > chi2 = 0.0000 Log likelihood = -868.46481 Pseudo R2 = 0.1672 ------------------------------------------------------------------------------ tenure | Odds Ratio Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- female | .7231118 .0933734 -2.51 0.012 .5614256 .9313624 year | 6.079157 .9791881 11.21 0.000 4.433409 8.335832 yearsq | .8785368 .0121689 -9.35 0.000 .8550071 .9027141 ------------------------------------------------------------------------------ . matrix b = e(b) // get betas . matrix v = e(V) // get covariance of betas . matrix stats[2,1] = exp(b[1,1]) // compute OR for female . matrix stats[2,2] = b[1,1]/sqrt(v[1,1]) // compute z . estat ic // get BIC ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -868.4648 4 1744.93 1768.675 ------------------------------------------------------------------------------ . matrix temp = r(S) . matrix stats[2,3] = temp[1,6] . . // #4c + department . . logit tenure female `Vtime' `Vdept', or Iteration 0: log likelihood = -1042.8284 Iteration 1: log likelihood = -897.88871 Iteration 2: log likelihood = -864.3311 Iteration 3: log likelihood = -860.0445 Iteration 4: log likelihood = -859.89765 Iteration 5: log likelihood = -859.89742 Logistic regression Number of obs = 2797 LR chi2(5) = 365.86 Prob > chi2 = 0.0000 Log likelihood = -859.89742 Pseudo R2 = 0.1754 ------------------------------------------------------------------------------ tenure | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- female | .7208627 .0936383 -2.52 0.012 .5588349 .9298686 year | 6.161345 .9978713 11.23 0.000 4.485571 8.463176 yearsq | .8778896 .0122234 -9.35 0.000 .8542561 .902177 select | 1.151231 .0519671 3.12 0.002 1.053753 1.257726 prestige | .7697678 .0639568 -3.15 0.002 .6540891 .9059049 ------------------------------------------------------------------------------ . matrix b = e(b) // get betas . 
matrix v = e(V) // get covariance of betas
. matrix stats[3,1] = exp(b[1,1]) // compute OR for female
. matrix stats[3,2] = b[1,1]/sqrt(v[1,1]) // compute z
. estat ic // get BIC

------------------------------------------------------------------------------
       Model |    Obs    ll(null)   ll(model)     df          AIC         BIC
-------------+----------------------------------------------------------------
           . |   2797   -1042.828   -859.8974      6     1731.795    1767.413
------------------------------------------------------------------------------

. matrix temp = r(S)
. matrix stats[3,3] = temp[1,6]
.
. // #4d + productivity
.
. logit tenure female `Vtime' `Vdept' `Vprod', or

Iteration 0:   log likelihood = -1042.8284
Iteration 1:   log likelihood = -884.11858
Iteration 2:   log likelihood = -843.44392
Iteration 3:   log likelihood = -838.71299
Iteration 4:   log likelihood = -838.53328
Iteration 5:   log likelihood = -838.53294

Logistic regression                               Number of obs   =       2797
                                                  LR chi2(6)      =     408.59
                                                  Prob > chi2     =     0.0000
Log likelihood = -838.53294                       Pseudo R2       =     0.1959

------------------------------------------------------------------------------
      tenure | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |   .7020495     .09273    -2.68   0.007     .5419223    .9094912
        year |   5.602668   .9179641    10.52   0.000     4.063783    7.724302
      yearsq |   .8821916   .0124554    -8.88   0.000     .8581142    .9069446
      select |   1.166965   .0537258     3.35   0.001     1.066275    1.277162
    prestige |   .6612555   .0583588    -4.69   0.000     .5562204    .7861251
    articles |   1.056356   .0090561     6.40   0.000     1.038754    1.074255
------------------------------------------------------------------------------

. matrix b = e(b) // get betas
. matrix v = e(V) // get covariance of betas
. matrix stats[4,1] = exp(b[1,1]) // compute OR for female
. matrix stats[4,2] = b[1,1]/sqrt(v[1,1]) // compute z
. estat ic // get BIC

------------------------------------------------------------------------------
       Model |    Obs    ll(null)   ll(model)     df          AIC         BIC
-------------+----------------------------------------------------------------
           . |   2797   -1042.828   -838.5329      7     1691.066     1732.62
------------------------------------------------------------------------------

. matrix temp = r(S)
. matrix stats[4,3] = temp[1,6]
.
. // #5
. // print results
.
. matrix list stats, format(%9.3f)

stats[4,3]
           ORfemale    zfemale        BIC
    base      0.807     -1.764   2098.363
plustime      0.723     -2.511   1768.675
plusdept      0.721     -2.520   1767.413
plusprod      0.702     -2.678   1732.620

.
. log close
       log:  D:\wf\work\wf7-matrix-nested.log
  log type:  text
 closed on:  24 Oct 2008, 09:41:51
--------------------------------------------------------------------------------

. exit

end of do-file

. do wf7-matrix-nested-include.do

. capture log close

. log using wf7-matrix-nested-include, replace text
(note: file D:\wf\work\wf7-matrix-nested-include.log not found)
--------------------------------------------------------------------------------
       log:  D:\wf\work\wf7-matrix-nested-include.log
  log type:  text
 opened on:  24 Oct 2008, 09:41:51

.
. // pgm: wf7-matrix-nested-include.do \ for stata 9
. // include: requires wf7-matrix-nested-include.doi
. // task: use matrix to collect results from nested models
. // project: workflow chapter 7
. // author: scott long \ 2008-10-24
.
. // #0
. // setup
.
. version 9.2
. set linesize 80
. clear // changed to clear all in stata 10
. macro drop _all
.
. // #1
. // load data and select sample
.
. use wf-tenure, clear
(Workflow data for gender differences in tenure \ 2008-04-02)

. * in stata 10 and later: datasignature confirm
. keep if sampleis
(148 observations deleted)

.
. // #2
. // define groups of variables
.
. local Vtime "year yearsq" // time in rank
. local Vdept "select prestige" // characteristics of department affiliations
. local Vprod "articles" // research productivity
.
. // #3
. // set up matrix for results
.
. local modelnm "base plustime plusdept plusprod"
. local statsnm "ORfemale zfemale BIC"
. matrix stats = J(4,3,-99)
. matrix rownames stats = `modelnm'
. matrix colnames stats = `statsnm'
. matrix list stats

stats[4,3]
          ORfemale   zfemale       BIC
    base       -99       -99       -99
plustime       -99       -99       -99
plusdept       -99       -99       -99
plusprod       -99       -99       -99

.
. // #4
. // nested models predicting tenure
.
. // #4a - baseline gender only model
.
. logit tenure female, or

Iteration 0:   log likelihood = -1042.8284
Iteration 1:   log likelihood = -1041.2482
Iteration 2:   log likelihood = -1041.2452

Logistic regression                               Number of obs   =       2797
                                                  LR chi2(1)      =       3.17
                                                  Prob > chi2     =     0.0752
Log likelihood = -1041.2452                       Pseudo R2       =     0.0015

------------------------------------------------------------------------------
      tenure | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |   .8069089   .0981196    -1.76   0.078     .6357978    1.024071
------------------------------------------------------------------------------

. include wf7-matrix-nested-include.doi
. // include: wf7-matrix-nested-include.doi
. // used by: wf7-matrix-nested-include.do \ for stata 9
. // task: compute OR, z-test, BIC
. // project: workflow chapter
. // author: scott long \ 2008-10-24
.
. // note: irow does not need to be defined the 1st time the file is
. // called. Local will by default be a null string treated as 0.
.
. local irow = `irow' + 1
. matrix b = e(b) // get betas
. matrix v = e(V) // get covariance of betas
. matrix stats[`irow',1] = exp(b[1,1]) // compute OR for female
. matrix stats[`irow',2] = b[1,1]/sqrt(v[1,1]) // compute z
. quietly estat ic // get BIC
. matrix temp = r(S)
. matrix stats[`irow',3] = temp[1,6]
.
.
. // #4b + time
.
. logit tenure female `Vtime', or

Iteration 0:   log likelihood = -1042.8284
Iteration 1:   log likelihood = -904.40022
Iteration 2:   log likelihood = -872.58561
Iteration 3:   log likelihood = -868.59384
Iteration 4:   log likelihood = -868.46499
Iteration 5:   log likelihood = -868.46481

Logistic regression                               Number of obs   =       2797
                                                  LR chi2(3)      =     348.73
                                                  Prob > chi2     =     0.0000
Log likelihood = -868.46481                       Pseudo R2       =     0.1672

------------------------------------------------------------------------------
      tenure | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |   .7231118   .0933734    -2.51   0.012     .5614256    .9313624
        year |   6.079157   .9791881    11.21   0.000     4.433409    8.335832
      yearsq |   .8785368   .0121689    -9.35   0.000     .8550071    .9027141
------------------------------------------------------------------------------

. include wf7-matrix-nested-include.doi
. // include: wf7-matrix-nested-include.doi
. // used by: wf7-matrix-nested-include.do \ for stata 9
. // task: compute OR, z-test, BIC
. // project: workflow chapter
. // author: scott long \ 2008-10-24
.
. // note: irow does not need to be defined the 1st time the file is
. // called. Local will by default be a null string treated as 0.
.
. local irow = `irow' + 1
. matrix b = e(b) // get betas
. matrix v = e(V) // get covariance of betas
. matrix stats[`irow',1] = exp(b[1,1]) // compute OR for female
. matrix stats[`irow',2] = b[1,1]/sqrt(v[1,1]) // compute z
. quietly estat ic // get BIC
. matrix temp = r(S)
. matrix stats[`irow',3] = temp[1,6]
.
.
. // #4c + department
.
. logit tenure female `Vtime' `Vdept', or

Iteration 0:   log likelihood = -1042.8284
Iteration 1:   log likelihood = -897.88871
Iteration 2:   log likelihood =  -864.3311
Iteration 3:   log likelihood =  -860.0445
Iteration 4:   log likelihood = -859.89765
Iteration 5:   log likelihood = -859.89742

Logistic regression                               Number of obs   =       2797
                                                  LR chi2(5)      =     365.86
                                                  Prob > chi2     =     0.0000
Log likelihood = -859.89742                       Pseudo R2       =     0.1754

------------------------------------------------------------------------------
      tenure | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |   .7208627   .0936383    -2.52   0.012     .5588349    .9298686
        year |   6.161345   .9978713    11.23   0.000     4.485571    8.463176
      yearsq |   .8778896   .0122234    -9.35   0.000     .8542561     .902177
      select |   1.151231   .0519671     3.12   0.002     1.053753    1.257726
    prestige |   .7697678   .0639568    -3.15   0.002     .6540891    .9059049
------------------------------------------------------------------------------

. include wf7-matrix-nested-include.doi
. // include: wf7-matrix-nested-include.doi
. // used by: wf7-matrix-nested-include.do \ for stata 9
. // task: compute OR, z-test, BIC
. // project: workflow chapter
. // author: scott long \ 2008-10-24
.
. // note: irow does not need to be defined the 1st time the file is
. // called. Local will by default be a null string treated as 0.
.
. local irow = `irow' + 1
. matrix b = e(b) // get betas
. matrix v = e(V) // get covariance of betas
. matrix stats[`irow',1] = exp(b[1,1]) // compute OR for female
. matrix stats[`irow',2] = b[1,1]/sqrt(v[1,1]) // compute z
. quietly estat ic // get BIC
. matrix temp = r(S)
. matrix stats[`irow',3] = temp[1,6]
.
.
. // #4d + productivity
.
. logit tenure female `Vtime' `Vdept' `Vprod', or

Iteration 0:   log likelihood = -1042.8284
Iteration 1:   log likelihood = -884.11858
Iteration 2:   log likelihood = -843.44392
Iteration 3:   log likelihood = -838.71299
Iteration 4:   log likelihood = -838.53328
Iteration 5:   log likelihood = -838.53294

Logistic regression                               Number of obs   =       2797
                                                  LR chi2(6)      =     408.59
                                                  Prob > chi2     =     0.0000
Log likelihood = -838.53294                       Pseudo R2       =     0.1959

------------------------------------------------------------------------------
      tenure | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |   .7020495     .09273    -2.68   0.007     .5419223    .9094912
        year |   5.602668   .9179641    10.52   0.000     4.063783    7.724302
      yearsq |   .8821916   .0124554    -8.88   0.000     .8581142    .9069446
      select |   1.166965   .0537258     3.35   0.001     1.066275    1.277162
    prestige |   .6612555   .0583588    -4.69   0.000     .5562204    .7861251
    articles |   1.056356   .0090561     6.40   0.000     1.038754    1.074255
------------------------------------------------------------------------------

. include wf7-matrix-nested-include.doi
. // include: wf7-matrix-nested-include.doi
. // used by: wf7-matrix-nested-include.do \ for stata 9
. // task: compute OR, z-test, BIC
. // project: workflow chapter
. // author: scott long \ 2008-10-24
.
. // note: irow does not need to be defined the 1st time the file is
. // called. Local will by default be a null string treated as 0.
.
. local irow = `irow' + 1
. matrix b = e(b) // get betas
. matrix v = e(V) // get covariance of betas
. matrix stats[`irow',1] = exp(b[1,1]) // compute OR for female
. matrix stats[`irow',2] = b[1,1]/sqrt(v[1,1]) // compute z
. quietly estat ic // get BIC
. matrix temp = r(S)
. matrix stats[`irow',3] = temp[1,6]
.
.
. // #5
. // print results
.
. matrix list stats, format(%9.3f)

stats[4,3]
           ORfemale    zfemale        BIC
    base      0.807     -1.764   2098.363
plustime      0.723     -2.511   1768.675
plusdept      0.721     -2.520   1767.413
plusprod      0.702     -2.678   1732.620

.
.
log close log: D:\wf\work\wf7-matrix-nested-include.log log type: text closed on: 24 Oct 2008, 09:41:51 -------------------------------------------------------------------------------- . exit end of do-file . do wf7-matrix-arttran.do . capture log close . log using wf7-matrix-arttran, replace text (note: file D:\wf\work\wf7-matrix-arttran.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-matrix-arttran.log log type: text opened on: 24 Oct 2008, 09:41:51 . . // pgm: wf7-matrix-arttran.do \ for stata 9 . // task: use matrix to collect results from different transformations . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data and select sample . . use wf-tenure, clear (Workflow data for gender differences in tenure \ 2008-04-02) . * in stata 10 and later: datasignature confirm . keep if sampleis (148 observations deleted) . . // #2 . // create art_root# = (articles)^(1/#) . . local artvars "" // list of variables to test . . forvalues root = 1(1)9 { // loop through each root 2. * take to the 1/root power . gen art_root`root' = articles^(1/`root') 3. label var art_root`root' "articles^(1/`root')" 4. * add new variable to the list . local artvars "`artvars' art_root`root'" 5. } . . // #3 . // matrix to hold results . . local nvars : word count `artvars' . matrix stats = J(`nvars',5,-99) . matrix rownames stats = `artvars' . matrix colnames stats = root sd b_std exp_b_std bic . . // #4 . // loop through models . . local irow = 0 . foreach avar in `artvars' { 2. local ++irow 3. * add root number to the matrix . matrix stats[`irow',1] = `irow' 4. * sd of avar . sum `avar' 5. local sd = r(sd) 6. matrix stats[`irow',2] = `sd' 7. * logit with avar . logit tenure `avar' female year yearsq select prestige, nolog 8. * save b*sd and exp(b*sd) for avar . 
matrix temp = e(b) 9. matrix stats[`irow',3] = temp[1,1]*`sd' 10. matrix stats[`irow',4] = exp(temp[1,1]*`sd') 11. * save bic . estat ic 12. matrix temp = r(S) 13. matrix stats[`irow',5] = temp[1,6] 14. } Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root1 | 2797 7.050411 6.575682 0 73 Logistic regression Number of obs = 2797 LR chi2(6) = 408.59 Prob > chi2 = 0.0000 Log likelihood = -838.53294 Pseudo R2 = 0.1959 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- art_root1 | .0548251 .008573 6.40 0.000 .0380224 .0716278 female | -.3537514 .1320848 -2.68 0.007 -.6126327 -.09487 year | 1.723243 .1638441 10.52 0.000 1.402114 2.044371 yearsq | -.125346 .0141187 -8.88 0.000 -.1530181 -.0976739 select | .1544061 .0460389 3.35 0.001 .0641715 .2446407 prestige | -.413615 .0882545 -4.69 0.000 -.5865907 -.2406394 _cons | -6.812655 .5290562 -12.88 0.000 -7.849586 -5.775724 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -838.5329 7 1691.066 1732.62 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root2 | 2797 2.383072 1.171269 0 8.544003 Logistic regression Number of obs = 2797 LR chi2(6) = 425.01 Prob > chi2 = 0.0000 Log likelihood = -830.32437 Pseudo R2 = 0.2038 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root2 | .4332367 .0576679 7.51 0.000 .3202097 .5462636 female | -.340382 .1324709 -2.57 0.010 -.6000202 -.0807438 year | 1.674869 .1646972 10.17 0.000 1.352068 1.99767 yearsq | -.1219834 .0141757 -8.61 0.000 -.1497674 -.0941995 select | .1545827 .0462078 3.35 0.001 .064017 .2451483 prestige | -.4437035 .088657 -5.00 0.000 -.6174681 -.269939 _cons | -7.307125 .5357716 -13.64 0.000 -8.357218 -6.257032 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -830.3244 7 1674.649 1716.203 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root3 | 2797 1.717082 .6459267 0 4.179339 Logistic regression Number of obs = 2797 LR chi2(6) = 424.42 Prob > chi2 = 0.0000 Log likelihood = -830.61834 Pseudo R2 = 0.2035 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root3 | .8999652 .1232629 7.30 0.000 .6583744 1.141556 female | -.3320245 .1322814 -2.51 0.012 -.5912913 -.0727578 year | 1.666784 .1647204 10.12 0.000 1.343938 1.989629 yearsq | -.12122 .0141645 -8.56 0.000 -.1489818 -.0934581 select | .1520817 .0460792 3.30 0.001 .061768 .2423953 prestige | -.4357212 .0880958 -4.95 0.000 -.6083859 -.2630566 _cons | -7.837491 .5514876 -14.21 0.000 -8.918386 -6.756595 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -830.6183 7 1675.237 1716.791 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root4 | 2797 1.467463 .4792778 0 2.923013 Logistic regression Number of obs = 2797 LR chi2(6) = 420.61 Prob > chi2 = 0.0000 Log likelihood = -832.52554 Pseudo R2 = 0.2017 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root4 | 1.330429 .193361 6.88 0.000 .9514482 1.709409 female | -.3270201 .1320287 -2.48 0.013 -.5857916 -.0682486 year | 1.668452 .16459 10.14 0.000 1.345861 1.991042 yearsq | -.1211684 .0141445 -8.57 0.000 -.148891 -.0934458 select | .1499813 .045934 3.27 0.001 .0599523 .2400103 prestige | -.4233462 .087572 -4.83 0.000 -.5949842 -.2517083 _cons | -8.290543 .5762453 -14.39 0.000 -9.419963 -7.161123 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -832.5255 7 1679.051 1720.605 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root5 | 2797 1.338454 .4025341 0 2.358656 Logistic regression Number of obs = 2797 LR chi2(6) = 415.99 Prob > chi2 = 0.0000 Log likelihood = -834.83473 Pseudo R2 = 0.1995 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root5 | 1.695841 .2643407 6.42 0.000 1.177743 2.213939 female | -.323661 .13178 -2.46 0.014 -.5819451 -.0653769 year | 1.673509 .1644332 10.18 0.000 1.351226 1.995792 yearsq | -.1213811 .0141244 -8.59 0.000 -.1490643 -.0936978 select | .1482849 .0458009 3.24 0.001 .0585167 .2380531 prestige | -.4106689 .0871326 -4.71 0.000 -.5814458 -.2398921 _cons | -8.659509 .6082361 -14.24 0.000 -9.85163 -7.467388 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -834.8347 7 1683.669 1725.224 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root6 | 2797 1.260004 .3598848 0 2.044343 Logistic regression Number of obs = 2797 LR chi2(6) = 411.29 Prob > chi2 = 0.0000 Log likelihood = -837.18504 Pseudo R2 = 0.1972 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root6 | 1.990187 .3344517 5.95 0.000 1.334674 2.6457 female | -.3211841 .1315504 -2.44 0.015 -.5790181 -.0633501 year | 1.680002 .1642826 10.23 0.000 1.358014 2.00199 yearsq | -.121713 .0141061 -8.63 0.000 -.1493605 -.0940656 select | .1468831 .0456826 3.22 0.001 .0573468 .2364194 prestige | -.3984683 .0867579 -4.59 0.000 -.5685106 -.228426 _cons | -8.948241 .6454061 -13.86 0.000 -10.21321 -7.683268 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -837.185 7 1688.37 1729.924 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root7 | 2797 1.207364 .333293 0 1.845818 Logistic regression Number of obs = 2797 LR chi2(6) = 406.78 Prob > chi2 = 0.0000 Log likelihood = -839.43831 Pseudo R2 = 0.1950 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root7 | 2.213966 .4023596 5.50 0.000 1.425356 3.002577 female | -.3192306 .1313433 -2.43 0.015 -.5766587 -.0618026 year | 1.687177 .1641465 10.28 0.000 1.365456 2.008898 yearsq | -.1221061 .0140899 -8.67 0.000 -.1497218 -.0944903 select | .1456999 .045578 3.20 0.001 .0563688 .2350311 prestige | -.3868973 .0864281 -4.48 0.000 -.5562932 -.2175015 _cons | -9.161927 .6856449 -13.36 0.000 -10.50577 -7.818087 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -839.4383 7 1692.877 1734.431 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root8 | 2797 1.169634 .3153645 0 1.709682 Logistic regression Number of obs = 2797 LR chi2(6) = 402.59 Prob > chi2 = 0.0000 Log likelihood = -841.53528 Pseudo R2 = 0.1930 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root8 | 2.370269 .466713 5.08 0.000 1.455528 3.285009 female | -.3176208 .1311588 -2.42 0.015 -.5746874 -.0605543 year | 1.694679 .1640258 10.33 0.000 1.373195 2.016164 yearsq | -.1225323 .0140758 -8.71 0.000 -.1501203 -.0949442 select | .1446879 .0454856 3.18 0.001 .0555377 .233838 prestige | -.3759743 .0861289 -4.37 0.000 -.5447837 -.2071648 _cons | -9.30613 .7268057 -12.80 0.000 -10.73064 -7.881617 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -841.5353 7 1697.071 1738.625 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root9 | 2797 1.14128 .3025719 0 1.610779 Logistic regression Number of obs = 2797 LR chi2(6) = 398.76 Prob > chi2 = 0.0000 Log likelihood = -843.45036 Pseudo R2 = 0.1912 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root9 | 2.463891 .5259642 4.68 0.000 1.43302 3.494762 female | -.3162621 .1309963 -2.41 0.016 -.5730101 -.0595141 year | 1.702306 .163918 10.39 0.000 1.381032 2.023579 yearsq | -.1229755 .0140635 -8.74 0.000 -.1505394 -.0954116 select | .1438176 .0454046 3.17 0.002 .0548263 .232809 prestige | -.3657016 .08585 -4.26 0.000 -.5339646 -.1974386 _cons | -9.38709 .766672 -12.24 0.000 -10.88974 -7.884441 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -843.4504 7 1700.901 1742.455 ------------------------------------------------------------------------------ . . // #5 . // print summary of results . . local header "Comparing root transformations of articles in logit" . * NOTE: model also includes female year yearsq select prestige . matrix list stats, format(%9.3f) title(`header') stats[9,5]: Comparing root transformations of articles in logit root sd b_std exp_b_std bic art_root1 1.000 6.576 0.361 1.434 1732.620 art_root2 2.000 1.171 0.507 1.661 1716.203 art_root3 3.000 0.646 0.581 1.788 1716.791 art_root4 4.000 0.479 0.638 1.892 1720.605 art_root5 5.000 0.403 0.683 1.979 1725.224 art_root6 6.000 0.360 0.716 2.047 1729.924 art_root7 7.000 0.333 0.738 2.092 1734.431 art_root8 8.000 0.315 0.747 2.112 1738.625 art_root9 9.000 0.303 0.746 2.108 1742.455 . . // #6 . // add z-test and prob . . * create the matrix . local nvars : word count `artvars' . matrix stats = J(`nvars',7,-99) . matrix rownames stats = `artvars' . matrix colnames stats = root sd b_std exp_b_std bic z prob . . * loop through articles . local irow = 0 . foreach avar in `artvars' { 2. local ++irow 3. matrix stats[`irow',1] = `irow' 4. sum `avar' 5. local sd = r(sd) 6. 
matrix stats[`irow',2] = `sd' 7. qui logit tenure `avar' female year yearsq select prestige, nolog 8. matrix b = e(b) 9. matrix stats[`irow',3] = b[1,1]*`sd' 10. matrix stats[`irow',4] = exp(b[1,1]*`sd') 11. estat ic 12. matrix temp = r(S) 13. matrix stats[`irow',5] = temp[1,6] 14. * compute the z and p . matrix vc = e(V) 15. local ztest = b[1,1]/sqrt(vc[1,1]) 16. local prval = 2*normal(-abs(`ztest')) 17. matrix stats[`irow',6] = `ztest' 18. matrix stats[`irow',7] = `prval' 19. } Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root1 | 2797 7.050411 6.575682 0 73 ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -838.5329 7 1691.066 1732.62 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root2 | 2797 2.383072 1.171269 0 8.544003 ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -830.3244 7 1674.649 1716.203 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root3 | 2797 1.717082 .6459267 0 4.179339 ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -830.6183 7 1675.237 1716.791 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- art_root4 | 2797 1.467463 .4792778 0 2.923013 ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -832.5255 7 1679.051 1720.605 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root5 | 2797 1.338454 .4025341 0 2.358656 ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -834.8347 7 1683.669 1725.224 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root6 | 2797 1.260004 .3598848 0 2.044343 ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -837.185 7 1688.37 1729.924 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root7 | 2797 1.207364 .333293 0 1.845818 ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -839.4383 7 1692.877 1734.431 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- art_root8 | 2797 1.169634 .3153645 0 1.709682 ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -841.5353 7 1697.071 1738.625 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root9 | 2797 1.14128 .3025719 0 1.610779 ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -843.4504 7 1700.901 1742.455 ------------------------------------------------------------------------------ . . * print results . . local header "Comparing root transformations of articles in logit" . * NOTE: model also includes female year yearsq select prestige . matrix list stats, format(%9.3f) title(`header') stats[9,7]: Comparing root transformations of articles in logit root sd b_std exp_b_std bic z art_root1 1.000 6.576 0.361 1.434 1732.620 6.395 art_root2 2.000 1.171 0.507 1.661 1716.203 7.513 art_root3 3.000 0.646 0.581 1.788 1716.791 7.301 art_root4 4.000 0.479 0.638 1.892 1720.605 6.881 art_root5 5.000 0.403 0.683 1.979 1725.224 6.415 art_root6 6.000 0.360 0.716 2.047 1729.924 5.951 art_root7 7.000 0.333 0.738 2.092 1734.431 5.502 art_root8 8.000 0.315 0.747 2.112 1738.625 5.079 art_root9 9.000 0.303 0.746 2.108 1742.455 4.685 prob art_root1 0.000 art_root2 0.000 art_root3 0.000 art_root4 0.000 art_root5 0.000 art_root6 0.000 art_root7 0.000 art_root8 0.000 art_root9 0.000 . * add more decimal digits . 
matrix list stats, format(%12.9f) title(`header') stats[9,7]: Comparing root transformations of articles in logit root sd b_std exp_b_std bic art_root1 1.000000000 6.575681670 0.360512103 1.434063615 1.73262e+03 art_root2 2.000000000 1.171268757 0.507436563 1.661027793 1.71620e+03 art_root3 3.000000000 0.645926712 0.581311590 1.788382518 1.71679e+03 art_root4 4.000000000 0.479277824 0.637645000 1.892019922 1.72061e+03 art_root5 5.000000000 0.402534145 0.682633887 1.979083555 1.72522e+03 art_root6 6.000000000 0.359884819 0.716238036 2.046719026 1.72992e+03 art_root7 7.000000000 0.333292972 0.737899419 2.091537453 1.73443e+03 art_root8 8.000000000 0.315364540 0.747498701 2.111711384 1.73862e+03 art_root9 9.000000000 0.302571868 0.745504078 2.107503513 1.74245e+03 z prob art_root1 6.395111495 0.000000000 art_root2 7.512619634 0.000000000 art_root3 7.301184375 0.000000000 art_root4 6.880543597 0.000000000 art_root5 6.415360888 0.000000000 art_root6 5.950596165 0.000000003 art_root7 5.502456663 0.000000037 art_root8 5.078643075 0.000000380 art_root9 4.684521654 0.000002806 . . log close log: D:\wf\work\wf7-matrix-arttran.log log type: text closed on: 24 Oct 2008, 09:41:52 -------------------------------------------------------------------------------- . exit end of do-file . do wf7-matrix-plot.do . capture log close . log using wf7-matrix-plot, replace text (note: file D:\wf\work\wf7-matrix-plot.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-matrix-plot.log log type: text opened on: 24 Oct 2008, 09:41:52 . . // pgm: wf7-matrix-plot.do \ for stata 9 . // task: plot results collected in a matrix . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . set scheme s2manual . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data and select sample . . 
use wf-tenure, clear (Workflow data for gender differences in tenure \ 2008-04-02) . * in stata 10 and later: datasignature confirm . keep if sampleis (148 observations deleted) . . // #2 . // create art_root# = (articles)^(1/#) . . local artvars "" // list of variables to test . . forvalues root = 1(1)9 { // loop through each root 2. * take to the 1/root power . gen art_root`root' = articles^(1/`root') 3. label var art_root`root' "articles^(1/`root')" 4. * add new variable to the list . local artvars "`artvars' art_root`root'" 5. } . . // #3 . // matrix to hold results . . local nvars : word count `artvars' . matrix stats = J(`nvars',5,-99) . matrix rownames stats = `artvars' . matrix colnames stats = root sd b_std exp_b_std bic . . // #4 . // loop through models . . local irow = 0 . . foreach avar in `artvars' { 2. . local ++irow 3. * what root is being analyzed? . matrix stats[`irow',1] = `irow' 4. * sd of avar . sum `avar' 5. local sd = r(sd) 6. matrix stats[`irow',2] = `sd' 7. * logit with avar . logit tenure `avar' female year yearsq select prestige, nolog 8. * save b*sd and exp(b*sd) for avar . matrix temp = e(b) 9. matrix stats[`irow',3] = temp[1,1]*`sd' 10. matrix stats[`irow',4] = exp(temp[1,1]*`sd') 11. * save bic . estat ic 12. matrix temp = r(S) 13. matrix stats[`irow',5] = temp[1,6] 14. . } Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root1 | 2797 7.050411 6.575682 0 73 Logistic regression Number of obs = 2797 LR chi2(6) = 408.59 Prob > chi2 = 0.0000 Log likelihood = -838.53294 Pseudo R2 = 0.1959 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root1 | .0548251 .008573 6.40 0.000 .0380224 .0716278 female | -.3537514 .1320848 -2.68 0.007 -.6126327 -.09487 year | 1.723243 .1638441 10.52 0.000 1.402114 2.044371 yearsq | -.125346 .0141187 -8.88 0.000 -.1530181 -.0976739 select | .1544061 .0460389 3.35 0.001 .0641715 .2446407 prestige | -.413615 .0882545 -4.69 0.000 -.5865907 -.2406394 _cons | -6.812655 .5290562 -12.88 0.000 -7.849586 -5.775724 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -838.5329 7 1691.066 1732.62 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root2 | 2797 2.383072 1.171269 0 8.544003 Logistic regression Number of obs = 2797 LR chi2(6) = 425.01 Prob > chi2 = 0.0000 Log likelihood = -830.32437 Pseudo R2 = 0.2038 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root2 | .4332367 .0576679 7.51 0.000 .3202097 .5462636 female | -.340382 .1324709 -2.57 0.010 -.6000202 -.0807438 year | 1.674869 .1646972 10.17 0.000 1.352068 1.99767 yearsq | -.1219834 .0141757 -8.61 0.000 -.1497674 -.0941995 select | .1545827 .0462078 3.35 0.001 .064017 .2451483 prestige | -.4437035 .088657 -5.00 0.000 -.6174681 -.269939 _cons | -7.307125 .5357716 -13.64 0.000 -8.357218 -6.257032 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -830.3244 7 1674.649 1716.203 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root3 | 2797 1.717082 .6459267 0 4.179339 Logistic regression Number of obs = 2797 LR chi2(6) = 424.42 Prob > chi2 = 0.0000 Log likelihood = -830.61834 Pseudo R2 = 0.2035 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root3 | .8999652 .1232629 7.30 0.000 .6583744 1.141556 female | -.3320245 .1322814 -2.51 0.012 -.5912913 -.0727578 year | 1.666784 .1647204 10.12 0.000 1.343938 1.989629 yearsq | -.12122 .0141645 -8.56 0.000 -.1489818 -.0934581 select | .1520817 .0460792 3.30 0.001 .061768 .2423953 prestige | -.4357212 .0880958 -4.95 0.000 -.6083859 -.2630566 _cons | -7.837491 .5514876 -14.21 0.000 -8.918386 -6.756595 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -830.6183 7 1675.237 1716.791 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root4 | 2797 1.467463 .4792778 0 2.923013 Logistic regression Number of obs = 2797 LR chi2(6) = 420.61 Prob > chi2 = 0.0000 Log likelihood = -832.52554 Pseudo R2 = 0.2017 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root4 | 1.330429 .193361 6.88 0.000 .9514482 1.709409 female | -.3270201 .1320287 -2.48 0.013 -.5857916 -.0682486 year | 1.668452 .16459 10.14 0.000 1.345861 1.991042 yearsq | -.1211684 .0141445 -8.57 0.000 -.148891 -.0934458 select | .1499813 .045934 3.27 0.001 .0599523 .2400103 prestige | -.4233462 .087572 -4.83 0.000 -.5949842 -.2517083 _cons | -8.290543 .5762453 -14.39 0.000 -9.419963 -7.161123 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -832.5255 7 1679.051 1720.605 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root5 | 2797 1.338454 .4025341 0 2.358656 Logistic regression Number of obs = 2797 LR chi2(6) = 415.99 Prob > chi2 = 0.0000 Log likelihood = -834.83473 Pseudo R2 = 0.1995 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root5 | 1.695841 .2643407 6.42 0.000 1.177743 2.213939 female | -.323661 .13178 -2.46 0.014 -.5819451 -.0653769 year | 1.673509 .1644332 10.18 0.000 1.351226 1.995792 yearsq | -.1213811 .0141244 -8.59 0.000 -.1490643 -.0936978 select | .1482849 .0458009 3.24 0.001 .0585167 .2380531 prestige | -.4106689 .0871326 -4.71 0.000 -.5814458 -.2398921 _cons | -8.659509 .6082361 -14.24 0.000 -9.85163 -7.467388 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -834.8347 7 1683.669 1725.224 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root6 | 2797 1.260004 .3598848 0 2.044343 Logistic regression Number of obs = 2797 LR chi2(6) = 411.29 Prob > chi2 = 0.0000 Log likelihood = -837.18504 Pseudo R2 = 0.1972 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root6 | 1.990187 .3344517 5.95 0.000 1.334674 2.6457 female | -.3211841 .1315504 -2.44 0.015 -.5790181 -.0633501 year | 1.680002 .1642826 10.23 0.000 1.358014 2.00199 yearsq | -.121713 .0141061 -8.63 0.000 -.1493605 -.0940656 select | .1468831 .0456826 3.22 0.001 .0573468 .2364194 prestige | -.3984683 .0867579 -4.59 0.000 -.5685106 -.228426 _cons | -8.948241 .6454061 -13.86 0.000 -10.21321 -7.683268 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -837.185 7 1688.37 1729.924 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root7 | 2797 1.207364 .333293 0 1.845818 Logistic regression Number of obs = 2797 LR chi2(6) = 406.78 Prob > chi2 = 0.0000 Log likelihood = -839.43831 Pseudo R2 = 0.1950 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root7 | 2.213966 .4023596 5.50 0.000 1.425356 3.002577 female | -.3192306 .1313433 -2.43 0.015 -.5766587 -.0618026 year | 1.687177 .1641465 10.28 0.000 1.365456 2.008898 yearsq | -.1221061 .0140899 -8.67 0.000 -.1497218 -.0944903 select | .1456999 .045578 3.20 0.001 .0563688 .2350311 prestige | -.3868973 .0864281 -4.48 0.000 -.5562932 -.2175015 _cons | -9.161927 .6856449 -13.36 0.000 -10.50577 -7.818087 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -839.4383 7 1692.877 1734.431 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root8 | 2797 1.169634 .3153645 0 1.709682 Logistic regression Number of obs = 2797 LR chi2(6) = 402.59 Prob > chi2 = 0.0000 Log likelihood = -841.53528 Pseudo R2 = 0.1930 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root8 | 2.370269 .466713 5.08 0.000 1.455528 3.285009 female | -.3176208 .1311588 -2.42 0.015 -.5746874 -.0605543 year | 1.694679 .1640258 10.33 0.000 1.373195 2.016164 yearsq | -.1225323 .0140758 -8.71 0.000 -.1501203 -.0949442 select | .1446879 .0454856 3.18 0.001 .0555377 .233838 prestige | -.3759743 .0861289 -4.37 0.000 -.5447837 -.2071648 _cons | -9.30613 .7268057 -12.80 0.000 -10.73064 -7.881617 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -841.5353 7 1697.071 1738.625 ------------------------------------------------------------------------------ Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- art_root9 | 2797 1.14128 .3025719 0 1.610779 Logistic regression Number of obs = 2797 LR chi2(6) = 398.76 Prob > chi2 = 0.0000 Log likelihood = -843.45036 Pseudo R2 = 0.1912 ------------------------------------------------------------------------------ tenure | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- art_root9 | 2.463891 .5259642 4.68 0.000 1.43302 3.494762 female | -.3162621 .1309963 -2.41 0.016 -.5730101 -.0595141 year | 1.702306 .163918 10.39 0.000 1.381032 2.023579 yearsq | -.1229755 .0140635 -8.74 0.000 -.1505394 -.0954116 select | .1438176 .0454046 3.17 0.002 .0548263 .232809 prestige | -.3657016 .08585 -4.26 0.000 -.5339646 -.1974386 _cons | -9.38709 .766672 -12.24 0.000 -10.88974 -7.884441 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Model | Obs ll(null) ll(model) df AIC BIC -------------+---------------------------------------------------------------- . | 2797 -1042.828 -843.4504 7 1700.901 1742.455 ------------------------------------------------------------------------------ . . // #5 . // create variables from matrix . . matrix list stats stats[9,5] root sd b_std exp_b_std bic art_root1 1 6.5756817 .3605121 1.4340636 1732.62 art_root2 2 1.1712688 .50743656 1.6610278 1716.2029 art_root3 3 .64592671 .58131159 1.7883825 1716.7908 art_root4 4 .47927782 .637645 1.8920199 1720.6052 art_root5 5 .40253414 .68263389 1.9790836 1725.2236 art_root6 6 .35988482 .71623804 2.046719 1729.9242 art_root7 7 .33329297 .73789942 2.0915375 1734.4307 art_root8 8 .31536454 .7474987 2.1117114 1738.6247 art_root9 9 .30257187 .74550408 2.1075035 1742.4548 . svmat stats, names(col) . . // #6 . // plot results . . twoway (connected bic root, msymbol(circle)), /// > ytitle(BIC statistic) ylabel(1700(10)1750) /// > xtitle(Root transformation of articles) xlabel(1(1)9) /// > caption("wf7-matrix-plot.do 2008-10-24",size(vsmall)) . graph export wf7-matrix-plot.eps, replace (file wf7-matrix-plot.eps written in EPS format) . . 
log close log: D:\wf\work\wf7-matrix-plot.log log type: text closed on: 24 Oct 2008, 09:41:54 -------------------------------------------------------------------------------- . exit end of do-file . . * include files . do wf7-include-sample.do . capture log close . log using wf7-include-sample, replace text (note: file D:\wf\work\wf7-include-sample.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-include-sample.log log type: text opened on: 24 Oct 2008, 09:41:54 . . // program: wf7-include-sample.do \ for stata 9 . // include: requires wf7-include-sample.doi . // task: using include to select a sample . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data and select sample . . use wf-tenure, clear (Workflow data for gender differences in tenure \ 2008-04-02) . * in stata 10 and later: datasignature confirm . drop if year>=11 // drop cases with long time in rank (148 observations deleted) . drop if prestige<1 // drop if unrated department (10 observations deleted) . . // #2 . // compute descriptives . . summarize Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- sdscid | 2787 7642.001 4591.449 2001 18632 tenure | 2787 .1230714 .3285781 0 1 female | 2787 .3781844 .485021 0 1 year | 2787 3.860065 2.303702 1 10 yearsq | 2787 20.20524 22.17226 1 100 -------------+-------------------------------------------------------- select | 2787 4.997901 1.406436 1 7 articles | 2787 7.062433 6.582873 0 73 prestige | 2787 2.653339 .7700899 1 4.8 presthi | 2787 .0470039 .2116853 0 1 male | 2787 .6218156 .485021 0 1 -------------+-------------------------------------------------------- sampleis | 2787 1 0 1 1 . . // #3 . // load data and select sample with include file . . include wf7-include-sample.doi . 
// include: wf7-include-sample.doi . // used by: wf7-include-sample.do \ for stata 9 . // task: define sample for tenure example . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // #1 . // load data and select sample . . use wf-tenure, clear (Workflow data for gender differences in tenure \ 2008-04-02) . * in stata 10 and later: datasignature confirm . drop if year>=11 // drop cases with long time in rank (148 observations deleted) . drop if prestige<1 // drop if unrated department (10 observations deleted) . . . // #4 . // compute descriptives . . summarize Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- sdscid | 2787 7642.001 4591.449 2001 18632 tenure | 2787 .1230714 .3285781 0 1 female | 2787 .3781844 .485021 0 1 year | 2787 3.860065 2.303702 1 10 yearsq | 2787 20.20524 22.17226 1 100 -------------+-------------------------------------------------------- select | 2787 4.997901 1.406436 1 7 articles | 2787 7.062433 6.582873 0 73 prestige | 2787 2.653339 .7700899 1 4.8 presthi | 2787 .0470039 .2116853 0 1 male | 2787 .6218156 .485021 0 1 -------------+-------------------------------------------------------- sampleis | 2787 1 0 1 1 . . log close log: D:\wf\work\wf7-include-sample.log log type: text closed on: 24 Oct 2008, 09:41:54 -------------------------------------------------------------------------------- . exit end of do-file . . * baseline statistics . do wf7-baseline.do . capture log close . log using wf7-baseline, replace text (note: file D:\wf\work\wf7-baseline.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-baseline.log log type: text opened on: 24 Oct 2008, 09:41:54 . . // pgm: wf7-baseline.do \ for stata 9 . // task: tenure - baseline statistics . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . 
clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data and select sample . . use wf-tenure, clear (Workflow data for gender differences in tenure \ 2008-04-02) . * in stata 10 and later: datasignature confirm . tabulate sampleis Sample for | tenure | analysis | Freq. Percent Cum. ------------+----------------------------------- 0_Not | 148 5.03 5.03 1_InSample | 2,797 94.97 100.00 ------------+----------------------------------- Total | 2,945 100.00 . keep if sampleis (148 observations deleted) . . // #2 . // desc statistics for men & women combined . . codebook female male tenure year yearsq select articles prestige, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- female 2797 2 .3775474 0 1 Scientist is female? male 2797 2 .6224526 0 1 Is male? tenure 2797 2 .1229889 0 1 Is tenured? year 2797 10 3.855917 1 10 Years in rank yearsq 2797 10 20.16911 1 100 Years in rank squared select 2797 8 4.995048 1 7 Baccalaureate selectivity articles 2797 48 7.050411 0 73 Total number of articles prestige 2797 98 2.646591 .65 4.8 Prestige of department -------------------------------------------------------------------------------- . . // #3 . // desc statistics for women . . codebook female male tenure year yearsq select articles prestige /// > if female, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- female 1056 1 1 1 1 Scientist is female? male 1056 1 0 0 0 Is male? tenure 1056 2 .1089015 0 1 Is tenured? year 1056 10 3.974432 1 10 Years in rank yearsq 1056 10 21.45739 1 100 Years in rank squared select 1056 8 5.000852 1 7 Baccalaureate selectivity articles 1056 44 7.414773 0 73 Total number of articles prestige 1056 71 2.658144 .9 4.8 Prestige of department -------------------------------------------------------------------------------- . . // #4 . // desc statistics for men . . 
codebook female male tenure year yearsq select articles prestige /// > if male, compact Variable Obs Unique Mean Min Max Label -------------------------------------------------------------------------------- female 1741 1 0 0 0 Scientist is female? male 1741 1 1 1 1 Is male? tenure 1741 2 .1315336 0 1 Is tenured? year 1741 10 3.784032 1 10 Years in rank yearsq 1741 10 19.38771 1 100 Years in rank squared select 1741 8 4.991528 1 7 Baccalaureate selectivity articles 1741 37 6.829408 0 49 Total number of articles prestige 1741 81 2.639584 .65 4.64 Prestige of department -------------------------------------------------------------------------------- . . log close log: D:\wf\work\wf7-baseline.log log type: text closed on: 24 Oct 2008, 09:41:55 -------------------------------------------------------------------------------- . exit end of do-file . . * replication . do wf7-replicate-bootstrap.do . capture log close . log using wf7-replicate-bootstrap, replace text (note: file D:\wf\work\wf7-replicate-bootstrap.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-replicate-bootstrap.log log type: text opened on: 24 Oct 2008, 09:41:55 . . // pgm: wf7-replicate-bootstrap.do \ for stata 9 . // task: replication and the random number seed . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data and estimate the model . . use wf-lfp, clear (Workflow data on labor force participation \ 2008-04-02) . * in stata 10 and later: datasignature confirm . 
logit lfp k5 k618 age wc hc lwg inc Iteration 0: log likelihood = -514.8732 Iteration 1: log likelihood = -454.32339 Iteration 2: log likelihood = -452.64187 Iteration 3: log likelihood = -452.63296 Iteration 4: log likelihood = -452.63296 Logistic regression Number of obs = 753 LR chi2(7) = 124.48 Prob > chi2 = 0.0000 Log likelihood = -452.63296 Pseudo R2 = 0.1209 ------------------------------------------------------------------------------ lfp | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- k5 | -1.462913 .1970006 -7.43 0.000 -1.849027 -1.076799 k618 | -.0645707 .0680008 -0.95 0.342 -.1978499 .0687085 age | -.0628706 .0127831 -4.92 0.000 -.0879249 -.0378162 wc | .8072738 .2299799 3.51 0.000 .3565215 1.258026 hc | .1117336 .2060397 0.54 0.588 -.2920969 .515564 lwg | .6046931 .1508176 4.01 0.000 .3090961 .9002901 inc | -.0344464 .0082084 -4.20 0.000 -.0505346 -.0183583 _cons | 3.18214 .6443751 4.94 0.000 1.919188 4.445092 ------------------------------------------------------------------------------ . . // #2 . // bootstrap CI for prediction with seed 11020 . // note: 100 reps are used only to illustrate the point. . // For real-world applications, 1000 reps are needed! . . set seed 11020 . prvalue, boot reps(100) logit: Predictions for lfp Bootstrap confidence intervals using percentile method (100 of 100 replications completed) 95% Conf. Interval Pr(y=1_InLF|x): 0.5778 [ 0.5242, 0.6110] Pr(y=0_NotInL|x): 0.4222 [ 0.3890, 0.4758] k5 k618 age wc hc lwg inc x= .2377158 1.3532537 42.537849 .2815405 .39176627 1.0971148 20.128965 . . * run prvalue again, without setting the seed . prvalue, boot reps(100) logit: Predictions for lfp Bootstrap confidence intervals using percentile method (100 of 100 replications completed) 95% Conf. 
Interval Pr(y=1_InLF|x): 0.5778 [ 0.5361, 0.6318] Pr(y=0_NotInL|x): 0.4222 [ 0.3682, 0.4639] k5 k618 age wc hc lwg inc x= .2377158 1.3532537 42.537849 .2815405 .39176627 1.0971148 20.128965 . . // #3 . // bootstrap CI for prediction with seed 1121212 . // note: 100 reps are used only to illustrate the point. . // For real-world applications, 1000 reps are needed! . . set seed 1121212 . prvalue, boot reps(100) logit: Predictions for lfp Bootstrap confidence intervals using percentile method (100 of 100 replications completed) 95% Conf. Interval Pr(y=1_InLF|x): 0.5778 [ 0.5429, 0.6183] Pr(y=0_NotInL|x): 0.4222 [ 0.3817, 0.4571] k5 k618 age wc hc lwg inc x= .2377158 1.3532537 42.537849 .2815405 .39176627 1.0971148 20.128965 . . // #4 . // bootstrap CI for prediction after seed 1121212 . // seed is not reset before this run, so results differ from #3 . // note: 100 reps are used only to illustrate the point. . // For real-world applications, 1000 reps are needed! . . prvalue, boot reps(100) logit: Predictions for lfp Bootstrap confidence intervals using percentile method (100 of 100 replications completed) 95% Conf. Interval Pr(y=1_InLF|x): 0.5778 [ 0.5436, 0.6203] Pr(y=0_NotInL|x): 0.4222 [ 0.3797, 0.4564] k5 k618 age wc hc lwg inc x= .2377158 1.3532537 42.537849 .2815405 .39176627 1.0971148 20.128965 . . // #5 . // with 1000 replications, results are much more similar but it . // takes ten times longer to run . /* > set seed 11020 > prvalue, boot reps(1000) > set seed 1121212 > prvalue, boot reps(1000) > */ . . log close log: D:\wf\work\wf7-replicate-bootstrap.log log type: text closed on: 24 Oct 2008, 09:42:06 -------------------------------------------------------------------------------- . exit end of do-file . do wf7-replicate-setseed.do . capture log close . 
log using wf7-replicate-setseed, replace text (note: file D:\wf\work\wf7-replicate-setseed.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-replicate-setseed.log log type: text opened on: 24 Oct 2008, 09:42:06 . . // pgm: wf7-replicate-setseed.do \ for stata 9 . // task: letting stata set the seed . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // note: must be run immediately after starting Stata . // uniform() was replaced by runiform() in Stata 10.1 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // set # of observations . . set obs 100 obs was 0, now 100 . . // #2 . // check the seed stata automatically set . . creturn list System values ---------------------------------------------------------------------------- c(current_date) = "24 Oct 2008" c(current_time) = "09:42:06" c(rmsg_time) = 0 (seconds, from set rmsg) ---------------------------------------------------------------------------- c(stata_version) = 9.2 c(version) = 9.2 (version) ---------------------------------------------------------------------------- c(born_date) = "20 Jul 2007" c(flavor) = "Intercooled" c(SE) = 1 c(MP) = 0 c(mode) = "" c(console) = "" ---------------------------------------------------------------------------- c(os) = "Windows" c(osdtl) = "" c(machine_type) = "PC" c(byteorder) = "lohi" ---------------------------------------------------------------------------- Directories and paths ---------------------------------------------------------------------------- c(sysdir_stata) = "D:\Stata9/" (sysdir) c(sysdir_updates) = "D:\Stata9\ado\upd.." 
(sysdir) c(sysdir_base) = "D:\Stata9\ado\base/" (sysdir) c(sysdir_site) = "D:\Stata9\ado\site/" (sysdir) c(sysdir_plus) = "c:\ado\plus/" (sysdir) c(sysdir_personal) = "c:\ado\personal/" (sysdir) c(sysdir_oldplace) = "c:\ado/" (sysdir) ---------------------------------------------------------------------------- c(adopath) = "UPDATES;BASE;SITE.." (adopath) c(pwd) = "D:\wf\work" (cd) c(dirsep) = "/" ---------------------------------------------------------------------------- System limits ---------------------------------------------------------------------------- c(max_N_theory) = 2147483647 c(max_k_theory) = 5000 (set maxvar) c(max_width_theory) = 393204 (set maxvar) ---------------------------------------------------------------------------- c(max_N_current) = 10485756 (set memory) c(max_k_current) = 5000 (set memory) c(max_width_current) = 60000 (set memory) ---------------------------------------------------------------------------- c(max_matsize) = 11000 c(min_matsize) = 10 ---------------------------------------------------------------------------- c(max_macrolen) = 1081511 c(macrolen) = 165200 (set maxvar) c(max_cmdlen) = 1081527 c(cmdlen) = 165216 (set maxvar) c(namelen) = 32 ---------------------------------------------------------------------------- Numerical and string limits ---------------------------------------------------------------------------- c(mindouble) = -8.9884656743e+307 c(maxdouble) = 8.9884656743e+307 c(epsdouble) = 2.22044604925e-16 ---------------------------------------------------------------------------- c(minfloat) = -1.70141173319e+38 c(maxfloat) = 1.70141173319e+38 c(epsfloat) = 1.19209289551e-07 ---------------------------------------------------------------------------- c(minlong) = -2147483647 c(maxlong) = 2147483620 ---------------------------------------------------------------------------- c(minint) = -32767 c(maxint) = 32740 ---------------------------------------------------------------------------- c(minbyte) = -127 
c(maxbyte) = 100 ---------------------------------------------------------------------------- c(maxstrvarlen) = 244 ---------------------------------------------------------------------------- Current dataset ---------------------------------------------------------------------------- c(N) = 100 c(k) = 0 c(width) = 0 c(changed) = 0 c(filename) = "" c(filedate) = "" ---------------------------------------------------------------------------- Memory settings ---------------------------------------------------------------------------- c(memory) = 104857600 (set memory) c(maxvar) = 5000 (set maxvar) c(matsize) = 400 (set matsize) ---------------------------------------------------------------------------- Output settings ---------------------------------------------------------------------------- c(more) = "off" (set more) c(rmsg) = "off" (set rmsg) c(dp) = "period" (set dp) c(linesize) = 80 (set linesize) c(pagesize) = 44 (set pagesize) c(logtype) = "text" (set logtype) ---------------------------------------------------------------------------- Interface settings ---------------------------------------------------------------------------- c(dockable) = "on" (set dockable) c(dockingguides) = "on" (set dockingguides) c(locksplitters) = "off" (set locksplitters) c(persistfv) = "off" (set persistfv) c(persistvtopic) = "off" (set persistvtopic) c(reventries) = 100 (set reventries) c(xptheme) = "off" (set xptheme) c(doublebuffer) = "on" (set doublebuffer) c(linegap) = 1 (set linegap) c(scrollbufsize) = 32000 (set scrollbufsize) c(varlabelpos) = 32 (set varlabelpos) c(maxdb) = 50 (set maxdb) ---------------------------------------------------------------------------- Graphics settings ---------------------------------------------------------------------------- c(graphics) = "on" (set graphics) c(scheme) = "s2manual" (set scheme) c(printcolor) = "automatic" (set printcolor) c(copycolor) = "automatic" (set copycolor) 
---------------------------------------------------------------------------- Efficiency settings ---------------------------------------------------------------------------- c(adosize) = 500 (set adosize) c(virtual) = "off" (set virtual) ---------------------------------------------------------------------------- Network settings ---------------------------------------------------------------------------- c(checksum) = "off" (set checksum) c(timeout1) = 120 (set timeout1) c(timeout2) = 300 (set timeout2) ---------------------------------------------------------------------------- c(httpproxy) = "off" (set httpproxy) c(httpproxyhost) = "" (set httpproxyhost) c(httpproxyport) = 80 (set httpproxyport) ---------------------------------------------------------------------------- c(httpproxyauth) = "off" (set httpproxyauth) c(httpproxyuser) = "" (set httpproxyuser) c(httpproxypw) = "" (set httpproxypw) ---------------------------------------------------------------------------- Update settings ---------------------------------------------------------------------------- c(update_query) = "on" (set update_query) c(update_interval) = 7 (set update_interval) c(update_prompt) = "on" (set update_prompt) ---------------------------------------------------------------------------- Trace (program debugging) settings ---------------------------------------------------------------------------- c(trace) = "off" (set trace) c(tracedepth) = 32000 (set tracedepth) c(tracesep) = "on" (set tracesep) c(traceindent) = "on" (set traceindent) c(traceexpand) = "on" (set traceexpand) c(tracenumber) = "off" (set tracenumber) c(tracehilite) = "" (set tracehilite) ---------------------------------------------------------------------------- Mata settings ---------------------------------------------------------------------------- c(matastrict) = "off" (set matastrict) c(matalnum) = "off" (set matalnum) c(mataoptimize) = "on" (set mataoptimize) c(matafavor) = "space" (set matafavor) c(matacache) = 
400 (set matacache) c(matalibs) = "lmatabase;lmataado" (set matalibs) c(matamofirst) = "off" (set matamofirst) ---------------------------------------------------------------------------- Other settings ---------------------------------------------------------------------------- c(type) = "float" (set type) c(level) = 95 (set level) c(maxiter) = 16000 (set maxiter) c(searchdefault) = "local" (set searchdefault) c(seed) = "X3ba2814009fefac1.." (set seed) c(varabbrev) = "on" (set varabbrev) ---------------------------------------------------------------------------- Other ---------------------------------------------------------------------------- c(pi) = 3.141592653589793 c(alpha) = "a b c d e f g h i.." c(ALPHA) = "A B C D E F G H I.." c(Mons) = "Jan Feb Mar Apr M.." c(Months) = "January February .." c(Wdays) = "Sun Mon Tue Wed T.." c(Weekdays) = "Sunday Monday Tue.." c(rc) = 606 (capture) ---------------------------------------------------------------------------- . local seedis = c(seed) . di "`seedis'" X3ba2814009fefac13ae773998d7b76f90061 . . // #3 . // generate random numbers with this seed . . gen u1 = uniform() // renamed to runiform() in stata 10.1 . sum u1 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- u1 | 100 .5639921 .3056 .0111838 .9872538 . . // #4 . // generate random numbers based on seed set by stata initially . . gen u2 = uniform() // renamed to runiform() in stata 10.1 . sum u2 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- u2 | 100 .4613564 .2933409 .0053628 .9983065 . . // #5 . // generate random numbers with a seed I pick . . set seed 1102 . gen u3 = uniform() // renamed to runiform() in stata 10.1 . sum u3 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- u3 | 100 .510491 .2671904 .0150277 .9548866 . . // #6 . // generate random numbers with seed saved above . . 
set seed `seedis' . gen u4 = uniform() // renamed to runiform() in stata 10.1 . pwcorr u1 u2 u3 u3 u4 | u1 u2 u3 u3 u4 -------------+--------------------------------------------- u1 | 1.0000 u2 | 0.0567 1.0000 u3 | -0.0267 0.1003 1.0000 u3 | -0.0267 0.1003 1.0000 1.0000 u4 | 1.0000 0.0567 -0.0267 -0.0267 1.0000 . . log close log: D:\wf\work\wf7-replicate-setseed.log log type: text closed on: 24 Oct 2008, 09:42:06 -------------------------------------------------------------------------------- . exit end of do-file . do wf7-replicate-stepwise.do . capture log close . log using wf7-replicate-stepwise, replace text (note: file D:\wf\work\wf7-replicate-stepwise.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-replicate-stepwise.log log type: text opened on: 24 Oct 2008, 09:42:06 . . // pgm: wf7-replicate-stepwise.do \ for stata 9 . // task: effect of seed on results when using a training and . // confirmation sample . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // note: uniform() was replaced by runiform() in Stata 10.1 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data . . use wf-articles, clear (Workflow data on scientific productivity \ 2008-04-11) . * in stata 10 and later: datasignature confirm . . // #2 . // random selection 1: randomly select half of the cases . . set seed X57c74068e0f7a3200d5b8463f279bb82065a . generate train1 = (uniform() < .5) // renamed to runiform() in stata 10.1 . label var train1 "Training sample?" . label def trainlbl 0 "0Confirm" 1 "1Train" . label val train1 trainlbl . . * full model with EXPLORATION sample . quietly nbreg art fem mar kid5 phd ment if train1==1 . estimates store train1full . . * trim model with stepwise procedures with EXPLORATION sample . quietly stepwise, pr(.05): nbreg art fem mar kid5 phd ment if train1==1 . 
estimates store train1trim . . * estimate trimmed model with CONFIRMATION sample . quietly nbreg art fem kid5 ment if train1==0 . estimates store confirm1trim . . * estimate full model with CONFIRMATION sample . quietly nbreg art fem mar kid5 phd ment if train1==0 . estimates store confirm1full . . * Compare results from EXPLORATION AND CONFIRMATION SAMPLES . * They match quite well. . estimates table train1trim confirm1trim, /// > stats(N chi2) b(%9.3f) star -------------------------------------------- Variable | train1trim confirm1trim -------------+------------------------------ art | fem | -0.207* -0.249* ment | 0.019*** 0.038*** kid5 | -0.138* -0.138* _cons | 0.438*** 0.340*** -------------+------------------------------ lnalpha | _cons | -1.014*** -0.723*** -------------+------------------------------ Statistics | N | 478.000 437.000 chi2 | 24.417 71.336 -------------------------------------------- legend: * p<0.05; ** p<0.01; *** p<0.001 . estimates table train1full confirm1full train1trim confirm1trim, /// > stats(N chi2) b(%9.3f) star -------------------------------------------------------------------------- Variable | train1full confirm1full train1trim confirm1trim -------------+------------------------------------------------------------ art | fem | -0.201* -0.226* -0.207* -0.249* mar | 0.068 0.211 kid5 | -0.162* -0.188* -0.138* -0.138* phd | -0.031 0.069 ment | 0.020*** 0.037*** 0.019*** 0.038*** _cons | 0.489** 0.009 0.438*** 0.340*** -------------+------------------------------------------------------------ lnalpha | _cons | -1.016*** -0.751*** -1.014*** -0.723*** -------------+------------------------------------------------------------ Statistics | N | 478.000 437.000 478.000 437.000 chi2 | 25.282 75.784 24.417 71.336 -------------------------------------------------------------------------- legend: * p<0.05; ** p<0.01; *** p<0.001 . . // #3 . // random selection 2: randomly select half of the cases . . set seed 11051951 . 
generate train2 = (uniform() < .5) // renamed to runiform() in stata 10.1 . label var train2 "Training sample?" . label val train2 trainlbl . . * full model with EXPLORATION sample . quietly nbreg art fem mar kid5 phd ment if train2==1 . estimates store train2full . . * trim model with stepwise procedures with EXPLORATION sample . quietly stepwise, pr(.05): nbreg art fem mar kid5 phd ment /// > if train2==1 . estimates store train2trim . . * estimate trimmed model with CONFIRMATION sample . quietly nbreg art fem mar kid5 ment if train2==0 . estimates store confirm2trim . . * estimate full model with CONFIRMATION sample . quietly nbreg art fem mar kid5 phd ment if train2==0 . estimates store confirm2full . . * Compare results from EXPLORATION AND CONFIRMATION SAMPLES . * They match poorly. . estimates table train2trim confirm2trim, /// > stats(N chi2) b(%9.3f) star -------------------------------------------- Variable | train2trim confirm2trim -------------+------------------------------ art | fem | -0.304** -0.132 mar | 0.273* 0.015 kid5 | -0.211** -0.130 ment | 0.033*** 0.024*** _cons | 0.259* 0.361** -------------+------------------------------ lnalpha | _cons | -0.722*** -1.001*** -------------+------------------------------ Statistics | N | 456.000 459.000 chi2 | 69.155 29.522 -------------------------------------------- legend: * p<0.05; ** p<0.01; *** p<0.001 . 
estimates table train2full confirm2full train2trim confirm2trim, /// > stats(N chi2) b(%9.3f) star -------------------------------------------------------------------------- Variable | train2full confirm2full train2trim confirm2trim -------------+------------------------------------------------------------ art | fem | -0.304** -0.132 -0.304** -0.132 mar | 0.275* 0.024 0.273* 0.015 kid5 | -0.211** -0.128 -0.211** -0.130 phd | 0.008 0.037 ment | 0.033*** 0.023*** 0.033*** 0.024*** _cons | 0.235 0.246 0.259* 0.361** -------------+------------------------------------------------------------ lnalpha | _cons | -0.722*** -1.009*** -0.722*** -1.001*** -------------+------------------------------------------------------------ Statistics | N | 456.000 459.000 456.000 459.000 chi2 | 69.177 30.064 69.155 29.522 -------------------------------------------------------------------------- legend: * p<0.05; ** p<0.01; *** p<0.001 . . log close log: D:\wf\work\wf7-replicate-stepwise.log log type: text closed on: 24 Oct 2008, 09:42:09 -------------------------------------------------------------------------------- . exit end of do-file . . * presenting results . do wf7-tables-esttab.do . capture log close . log using wf7-tables-esttab, replace text (note: file D:\wf\work\wf7-tables-esttab.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-tables-esttab.log log type: text opened on: 24 Oct 2008, 09:42:09 . . // pgm: wf7-tables-esttab.do \ for stata 9 . // task: using eststo and esttab to format tables . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . . // #1 . // load data and select sample . . use wf-tenure, clear (Workflow data for gender differences in tenure \ 2008-04-02) . * in stata 10 and later: datasignature confirm . keep if sampleis (148 observations deleted) . . // #2 . 
// define groups of variables . . local Vtime "year yearsq" // time in rank . local Vdept "select prestige" // characteristics of department affiliations . local Vprod "articles" // research productivity . . // #3 . // nested models predicting tenure . . // #3a - baseline gender only model . . logit tenure female, nolog or Logistic regression Number of obs = 2797 LR chi2(1) = 3.17 Prob > chi2 = 0.0752 Log likelihood = -1041.2452 Pseudo R2 = 0.0015 ------------------------------------------------------------------------------ tenure | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- female | .8069089 .0981196 -1.76 0.078 .6357978 1.024071 ------------------------------------------------------------------------------ . eststo (est1 stored) . . // #3b + time . . logit tenure female `Vtime', nolog or Logistic regression Number of obs = 2797 LR chi2(3) = 348.73 Prob > chi2 = 0.0000 Log likelihood = -868.46481 Pseudo R2 = 0.1672 ------------------------------------------------------------------------------ tenure | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- female | .7231118 .0933734 -2.51 0.012 .5614256 .9313624 year | 6.079157 .9791881 11.21 0.000 4.433409 8.335832 yearsq | .8785368 .0121689 -9.35 0.000 .8550071 .9027141 ------------------------------------------------------------------------------ . eststo (est2 stored) . . // #3c + department . . logit tenure female `Vtime' `Vdept', nolog or Logistic regression Number of obs = 2797 LR chi2(5) = 365.86 Prob > chi2 = 0.0000 Log likelihood = -859.89742 Pseudo R2 = 0.1754 ------------------------------------------------------------------------------ tenure | Odds Ratio Std. Err. z P>|z| [95% Conf. 
Interval] -------------+---------------------------------------------------------------- female | .7208627 .0936383 -2.52 0.012 .5588349 .9298686 year | 6.161345 .9978713 11.23 0.000 4.485571 8.463176 yearsq | .8778896 .0122234 -9.35 0.000 .8542561 .902177 select | 1.151231 .0519671 3.12 0.002 1.053753 1.257726 prestige | .7697678 .0639568 -3.15 0.002 .6540891 .9059049 ------------------------------------------------------------------------------ . eststo (est3 stored) . . // #4 . // esttab options . . // #4a - default . . esttab ------------------------------------------------------------ (1) (2) (3) tenure tenure tenure ------------------------------------------------------------ female -0.215 -0.324* -0.327* (-1.76) (-2.51) (-2.52) year 1.805*** 1.818*** (11.21) (11.23) yearsq -0.129*** -0.130*** (-9.35) (-9.35) select 0.141** (3.12) prestige -0.262** (-3.15) _cons -1.887*** -6.927*** -7.002*** (-26.62) (-15.59) (-13.30) ------------------------------------------------------------ N 2797 2797 2797 ------------------------------------------------------------ t statistics in parentheses * p<0.05, ** p<0.01, *** p<0.001 . . // #4b - near final table . . esttab, eform nostar bic label varwidth(33) /// > title("Table 7.1: Workflow Example of Jann's esttab Command.") /// > mtitles("Model A" "Model B" "Model C") /// > addnote("Source: wf7-tables-esttab.do") Table 7.1: Workflow Example of Jann's esttab Command. ------------------------------------------------------------------------ (1) (2) (3) Model A Model B Model C ------------------------------------------------------------------------ Scientist is female? 
0.807 0.723 0.721 (-1.76) (-2.51) (-2.52) Years in rank 6.079 6.161 (11.21) (11.23) Years in rank squared 0.879 0.878 (-9.35) (-9.35) Baccalaureate selectivity 1.151 (3.12) Prestige of department 0.770 (-3.15) ------------------------------------------------------------------------ Observations 2797 2797 2797 BIC 2098.4 1768.7 1767.4 ------------------------------------------------------------------------ Exponentiated coefficients; t statistics in parentheses Source: wf7-tables-esttab.do . . // #4c - for latex . . esttab using wf7-estout.tex, eform nostar bic label varwidth(33) /// > mtitles("Model A" "Model B" "Model C") /// > addnote("Source: wf7-tables-esttab.do") replace (note: file wf7-estout.tex not found) (output written to wf7-estout.tex) . . log close log: D:\wf\work\wf7-tables-esttab.log log type: text closed on: 24 Oct 2008, 09:42:09 -------------------------------------------------------------------------------- . exit end of do-file . do wf7-graphs-colors.do . capture log close . log using wf7-graphs-colors, replace text (note: file D:\wf\work\wf7-graphs-colors.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-graphs-colors.log log type: text opened on: 24 Oct 2008, 09:42:09 . . // pgm: wf7-graphs-colors.do . // task: colors that look the same in B&W . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . set scheme s2manual . . // #1 . // load data . . use wf-tenure, clear (Workflow data for gender differences in tenure \ 2008-04-02) . * in stata 10 and later: datasignature confirm . sort male . by male: sum tenure -------------------------------------------------------------------------------- -> male = 0_Female Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- tenure | 1121 .1079393 .3104423 0 1 -------------------------------------------------------------------------------- -> male = 1_Male Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- tenure | 1824 .1326754 .3393167 0 1 . . // #2 . // get mean data to plot . . * set up variables . gen Mbg = . (2945 missing values generated) . label var Mbg "Men" . gen Wbg = . (2945 missing values generated) . label var Wbg "Women" . gen Vbg = . (2945 missing values generated) . label var Vbg "Variable" . label def Vbg 0 "Not Distinguished" 1 "Distinguished" . label val Vbg Vbg . . * tenure rates for men / high prestige . sum tenure if male==1 & presthi==1 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- tenure | 72 .0972222 .2983392 0 1 . matrix mn = r(mean) . local mn = mn[1,1]*100 . replace Mbg = `mn' in 2 (1 real change made) . replace Vbg = 1 in 2 (1 real change made) . . * tenure rates for men / low prestige . sum tenure if male==1 & presthi==0 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- tenure | 1752 .1341324 .3408918 0 1 . matrix mn = r(mean) . local mn = mn[1,1]*100 . replace Mbg = `mn' in 1 (1 real change made) . replace Vbg = 0 in 1 (1 real change made) . . * tenure rates for women / high prestige . sum tenure if male==0 & presthi==1 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- tenure | 61 .0655738 .2495898 0 1 . matrix mn = r(mean) . local mn = mn[1,1]*100 . replace Wbg = `mn' in 2 (1 real change made) . replace Vbg = 1 in 2 (0 real changes made) . . * tenure rates for women / low prestige . sum tenure if male==0 & presthi==0 Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- tenure | 1060 .1103774 .3135074 0 1 . matrix mn = r(mean) . local mn = mn[1,1]*100 . replace Wbg = `mn' in 1 (1 real change made) . replace Vbg = 0 in 1 (0 real changes made) . . // #3 . // green and red bars . . graph bar (mean) Mbg (mean) Wbg, over(Vbg) /// > legend(label(1 Men) label(2 Women)) ytitle("Percent Tenured") /// > ylabel(0(3)15) legend(label(1 Men) label(2 Women)) /// > bar(1,fcolor(red)) bar(2,fcolor(green)) . graph export wf7-graphs-colors.eps, replace (file wf7-graphs-colors.eps written in EPS format) . . log close log: D:\wf\work\wf7-graphs-colors.log log type: text closed on: 24 Oct 2008, 09:42:11 -------------------------------------------------------------------------------- . exit end of do-file . do wf7-graphs-fontsize.do . capture log close . log using wf7-graphs-fontsize, replace text (note: file D:\wf\work\wf7-graphs-fontsize.log not found) -------------------------------------------------------------------------------- log: D:\wf\work\wf7-graphs-fontsize.log log type: text opened on: 24 Oct 2008, 09:42:11 . . // pgm: wf7-graphs-fontsize.do \ for stata 9 . // task: fontsize when graphs are resized . // project: workflow chapter 7 . // author: scott long \ 2008-10-24 . . // #0 . // setup . . version 9.2 . set linesize 80 . clear // changed to clear all in stata 10 . macro drop _all . set scheme s2manual . . // #1 . // load data . . use wf-tenure, clear (Workflow data for gender differences in tenure \ 2008-04-02) . * in stata 10 and later: datasignature confirm . sort male . by male: sum tenure -------------------------------------------------------------------------------- -> male = 0_Female Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- tenure | 1121 .1079393 .3104423 0 1 -------------------------------------------------------------------------------- -> male = 1_Male Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- tenure | 1824 .1326754 .3393167 0 1 . . // #2 . // get mean data to plot . . * set up variables . gen Mbg = . (2945 missing values generated) . label var Mbg "Men" . gen Wbg = . (2945 missing values generated) . label var Wbg "Women" . gen Vbg = . (2945 missing values generated) . label var Vbg "Variable" . label def Vbg 0 "Not Distinguished" 1 "Distinguished" . label val Vbg Vbg . . * tenure rates for men / high prestige . sum tenure if male==1 & presthi==1 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- tenure | 72 .0972222 .2983392 0 1 . matrix mn = r(mean) . local mn = mn[1,1]*100 . replace Mbg = `mn' in 2 (1 real change made) . replace Vbg = 1 in 2 (1 real change made) . . * tenure rates for men / low prestige . sum tenure if male==1 & presthi==0 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- tenure | 1752 .1341324 .3408918 0 1 . matrix mn = r(mean) . local mn = mn[1,1]*100 . replace Mbg = `mn' in 1 (1 real change made) . replace Vbg = 0 in 1 (1 real change made) . . * tenure rates for women / high prestige . sum tenure if male==0 & presthi==1 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- tenure | 61 .0655738 .2495898 0 1 . matrix mn = r(mean) . local mn = mn[1,1]*100 . replace Wbg = `mn' in 2 (1 real change made) . replace Vbg = 1 in 2 (0 real changes made) . . * tenure rates for women / low prestige . sum tenure if male==0 & presthi==0 Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- tenure | 1060 .1103774 .3135074 0 1 . matrix mn = r(mean) . local mn = mn[1,1]*100 . replace Wbg = `mn' in 1 (1 real change made) . replace Vbg = 0 in 1 (0 real changes made) . . // #3 . // bar chart - default font size . . set scheme s2manual . 
graph bar (mean) Mbg (mean) Wbg, over(Vbg) /// > legend(label(1 Men) label(2 Women)) ytitle("Percent Tenured") /// > ylabel(0(3)15) legend(label(1 Men) label(2 Women)) /// > bar(1,fcolor(gs4)) bar(2,fcolor(gs13)) . graph export wf7-graphs-fontsize-default.eps, replace (file wf7-graphs-fontsize-default.eps written in EPS format) . . // #4 . // bar chart - larger font size . . graph bar (mean) Mbg (mean) Wbg, over(Vbg, label(labsize(vlarge))) /// > legend(label(1 Men) label(2 Women)) ytitle("Percent Tenured", size(vlarge) > ) /// > ylabel(0(3)15, labsize(large)) legend(label(1 Men) label(2 Women)) /// > bar(1,fcolor(gs4)) bar(2,fcolor(gs13)) . graph export wf7-graphs-fontsize-vlarge.eps, replace (file wf7-graphs-fontsize-vlarge.eps written in EPS format) . . log close log: D:\wf\work\wf7-graphs-fontsize.log log type: text closed on: 24 Oct 2008, 09:42:13 -------------------------------------------------------------------------------- . exit end of do-file . . log close master log: D:\wf\work\wf7.log log type: text closed on: 24 Oct 2008, 09:42:13 -------------------------------------------------------------------------------- . exit end of do-file . . log close wfall log: D:\wf\work\wf-all.log log type: text closed on: 24 Oct 2008, 09:42:13 --------------------------------------------------------------------------------