The Survey of Youth in Custody sampled juveniles and young adults in long-term, state-operated juvenile institutions. Residents of the facilities were interviewed about family background, previous criminal history, and drug and alcohol use.

The facilities form a natural cluster unit for an in-person survey; the sampling frame of 206 facilities was constructed from the Children in Custody (CIC) Census. The psus (facilities) were divided into 16 strata by number of residents. Each of the 11 facilities with 360 or more youth formed its own stratum (strata 6–16); each of these facilities was included in the sample and residents of the 11 facilities were subsampled. In strata 1–5, facilities were sampled with probability proportional to size from the 195 remaining facilities; residents were subsampled with predetermined sampling fractions.

The stratum boundaries were chosen so that the number of residents in each stratum would be comparable. It was originally intended that each resident have probability 1/8 of inclusion in the sample, which would result in a self-weighting sample with constant weight 8. The facilities in strata 14 and 16, however, had experienced a great deal of growth between the year of the census and the year of the survey, so the sampling fractions in those strata were changed to 1/11 and 1/12, respectively. In strata 1–5, weights varied from about 5 to about 15, depending on the facility’s inclusion probability and the predetermined sampling fraction in that facility. The weights were further adjusted for nonresponse, and to match the sample counts with the census count of youths in long-term, state-operated facilities. After all weighting adjustments were made, weights ranged from 5 (in stratum 4) to 50 (for some youths in states that required parental permission and hence had lower response rates).
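The relation between the sampling fractions and the base weights described above can be written out as a short worked example. The symbols are introduced here for illustration only and do not come from the survey documentation; the values are the weights before the nonresponse and census adjustments.

```latex
% Notation (assumed for this sketch):
%   \pi        = overall inclusion probability of a resident
%   w          = 1/\pi, the base sampling weight
%   \pi_{fac}  = inclusion probability of the facility (strata 1--5)
%   f_{fac}    = predetermined within-facility sampling fraction
\begin{align*}
  w &= \frac{1}{\pi} \\[4pt]
  \text{intended design: } \pi &= \tfrac{1}{8}  &&\Rightarrow\; w = 8 \\
  \text{stratum 14: }      \pi &= \tfrac{1}{11} &&\Rightarrow\; w = 11 \\
  \text{stratum 16: }      \pi &= \tfrac{1}{12} &&\Rightarrow\; w = 12 \\[4pt]
  \text{strata 1--5: }     \pi &= \pi_{fac}\, f_{fac}
      &&\Rightarrow\; w = \frac{1}{\pi_{fac}\, f_{fac}}
\end{align*}
```

In strata 6–16 each facility is included with certainty, so the inclusion probability of a resident is just the within-facility sampling fraction.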

Selected variables from the survey are in the file syc.csv.

  1. The weights are in the variable “finalwt”.
  2. The strata are in the variable “stratum”.
  3. The facilities (psus) are in the variable “facility”. There is only one facility in each of strata 6–16, so a stratified random sample of individuals is taken in each of those strata. In the SAS program, please define the psus for those strata to be individuals rather than the facility so that they contribute to the standard errors of the estimates. The following SAS statements can be used to create the new “psu” variable from the variable “facility”. This “psu” variable should be used for all analyses instead of the variable “facility”.

psu = facility;

if stratum ge 6 then psu = _N_;
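The two statements above have to sit inside a DATA step. A minimal sketch follows, assuming that syc.csv is in the current working directory, that the cutoff is stratum 6 (strata 6–16 are the single-facility strata), and that the dataset names syc_raw and syc are free to use; those names are hypothetical.

```sas
/* Read the CSV file; the path and the dataset name syc_raw are assumptions. */
proc import datafile="syc.csv" out=syc_raw dbms=csv replace;
  getnames=yes;
run;

data syc;
  set syc_raw;
  /* Strata 1-5: the facility is the psu. */
  psu = facility;
  /* Strata 6-16: one facility per stratum, so each youth becomes
     a psu; _N_ (the DATA step iteration counter) gives a unique id. */
  if stratum ge 6 then psu = _N_;
run;
```

Because psu identifiers only need to be unique within a stratum, reusing the row number in strata 6–16 does not conflict with the facility numbers kept in strata 1–5.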

  • Study variables are described in the table below.

Name of variable  Description

stratum           stratum number
facility          facility number
facsize           number of eligible residents in facility
finalwt           final weight
age               age of resident (99 = missing)

race              race of resident:
                  1 = white; 2 = black; 3 = Asian/Pacific Islander;
                  4 = American Indian, Aleut, Eskimo; 5 = Other; 9 = Missing

ethnicty          1 = Hispanic, 2 = not Hispanic, 9 = missing
sex               1 = male, 2 = female, 9 = missing

livewith          Who did you live with most of the time you were growing up?
                  1 = Mother only, 2 = Father only, 3 = Both mother and father,
                  4 = Grandparents, 5 = Other relatives, 6 = Friends, 7 = Foster home,
                  8 = Agency or institution, 9 = Someone else, 99 = Blank

famtime           Has anyone in your family, such as your mother, father, brother,
                  sister, ever served time in jail or prison?
                  1 = Yes, 2 = No, 7 = Don’t know, 9 = Blank

crimtype          most serious crime in current offense
                  1 = violent (e.g., murder, rape, robbery, assault)
                  2 = property (e.g., burglary, larceny, arson, fraud, motor vehicle theft)
                  3 = drug (drug possession or trafficking)
                  4 = public order (weapons violation, perjury, failure to appear in court)
                  5 = juvenile status offense (truancy, running away, incorrigible behavior)
                  9 = missing

                  ever put on probation or sent to correctional institution
                  for a violent offense prior to being sent here
                  1 = yes, 0 = no

numarr            number of times arrested (99 = missing)
probtn            number of times on probation (99 = missing)
agefirst          age first arrested (99 = missing)

alcuse            Did you drink alcohol at all during the year before being sent here this time?
                  1 = Yes; 2 = No, didn’t drink during year before;
                  3 = No, don’t drink at all; 9 = missing

everdrug          Ever used illegal drugs; 0 = no, 1 = yes, 9 = missing


  • Please do the following data creation and data cleaning before the analyses.
  1. Create the “psu” variable described above.
  2. Recode “99” to “.” (missing) for the variables below.
    1. age, livewith, numarr, probtn, and agefirst
  3. Recode “9” to “.” (missing) for the variables below.
    1. race, ethnicty, sex, crimtype, alcuse, and everdrug
  4. For “famtime”, “7” or “9” should be recoded as “.” (missing).
  5. Create the new variable for “crimtype”.
    1. Those with missing “crimtype” should also have missing for the new variable (violent_new).
    2. If “crimtype” is 1 (violent), then the new variable will be defined as “Youth with violent offense” (violent_new = 1).
    3. Otherwise, the new variable will be defined as “No violent offense” (violent_new = 0).
  6. Create the new variable for “livewith”.
    1. Those with missing “livewith” should also have missing for the new variable (livewith_new).
    2. If “livewith” is either 1 (Mother only) or 2 (Father only), then the new variable will be defined as “Single parent” (livewith_new = 1).
    3. Else if “livewith” is 3 (Both mother and father), then the new variable will be defined as “Both parents” (livewith_new = 2).
    4. Otherwise, the new variable will be defined as “Others” (livewith_new = 3).
  7. Create the new variable for “race”.
    1. Those with missing “race” should also have missing for the new variable (race_new).
    2. If “race” is 1 (White), then the new variable will be defined as “White” (race_new = 1).
    3. Else if “race” is 2 (Black), then the new variable will be defined as “Black” (race_new = 2).
    4. Otherwise, the new variable will be defined as “Others” (race_new = 3).
  8. Create the new variable for “age”.
    1. Those with missing “age” should also have missing for the new variable (age_new).
    2. If “age” is less than or equal to 15, then the new variable will be defined as “<=15” (age_new = 1).
    3. Else if “age” is either 16 or 17, then the new variable will be defined as “16 or 17” (age_new = 2).
    4. Otherwise, the new variable will be defined as “18+” (age_new = 3).
  9. Create the new variable for “alcuse”.
    1. Those with missing “alcuse” should also have missing for the new variable (alcuse_new).
    2. If “alcuse” is 1 (Yes), then the new variable will be defined as “Yes” (alcuse_new = 1).
    3. Otherwise, the new variable will be defined as “No” (alcuse_new = 2).
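The recoding steps above could be carried out in a single DATA step. The sketch below is one possible layout, not a required solution; it assumes the data are already in a dataset named syc (a hypothetical name) that contains the “psu” variable, and that all listed variables were read in as numeric.

```sas
data syc_clean;              /* output dataset name is an assumption */
  set syc;

  /* 99 -> missing */
  array v99 {*} age livewith numarr probtn agefirst;
  do i = 1 to dim(v99);
    if v99{i} = 99 then v99{i} = .;
  end;

  /* 9 -> missing (livewith is excluded: there 9 means "Someone else") */
  array v9 {*} race ethnicty sex crimtype alcuse everdrug;
  do i = 1 to dim(v9);
    if v9{i} = 9 then v9{i} = .;
  end;

  /* famtime: "Don't know" (7) and "Blank" (9) -> missing */
  if famtime in (7, 9) then famtime = .;

  /* Each derived variable tests for missing first, because SAS treats
     a missing numeric value as smaller than any number. */
  if missing(crimtype) then violent_new = .;
  else if crimtype = 1 then violent_new = 1;
  else violent_new = 0;

  if missing(livewith) then livewith_new = .;
  else if livewith in (1, 2) then livewith_new = 1;
  else if livewith = 3 then livewith_new = 2;
  else livewith_new = 3;

  if missing(race) then race_new = .;
  else if race = 1 then race_new = 1;
  else if race = 2 then race_new = 2;
  else race_new = 3;

  if missing(age) then age_new = .;
  else if age <= 15 then age_new = 1;
  else if age in (16, 17) then age_new = 2;
  else age_new = 3;

  if missing(alcuse) then alcuse_new = .;
  else if alcuse = 1 then alcuse_new = 1;
  else alcuse_new = 2;

  drop i;
run;
```

Checking for missing values before the comparisons matters most for age_new: without that first test, a missing age would satisfy `age <= 15` and be placed in the youngest group.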
  • The aims of the study are
  1. To estimate the proportion of youths who
    1. Are age 15 or younger
    2. Are held for a violent offense
    3. Lived with both parents when growing up
    4. Grew up primarily in a single-parent family
    5. Are male
    6. Are White
    7. Are Black
    8. Are Hispanic
    9. Have drunk alcohol at all during the year before being sent here
    10. Have used illegal drugs
  2. To estimate the average age first arrested.
  3. To estimate the average number of times arrested and the average number of times on probation for youths who were sent to the institution for a violent offense (using domain analysis).
  4. To test the association between “Has anyone in your family ever served time in jail or prison?” and “Have you ever been put on probation or sent to a correctional institution for a violent offense prior to being sent here?”
  5. To fit a logistic regression model predicting whether a youth is held for a violent offense from the following variables. First fit the model with one variable at a time; then include the significant variables together in the final model.
    1. Age (using 18+ as a reference group)
    2. Gender (using female as a reference group)
    3. Race (using a race variable with 3 categories and using white race as a reference group)
    4. Have drunk alcohol at all during the year before being sent here (using a variable with 2 categories and using “No” as a reference group)
    5. Ever used illegal drugs (using a variable with 2 categories and using “No” as a reference group)
    6. People you live with (using a variable with 3 categories and using a single parent as a reference group)
    7. Having a family member who has ever served time in jail or prison (using a variable with 2 categories and using “No” as a reference group)
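The aims above map onto the SAS survey procedures roughly as sketched below. This is an outline under stated assumptions rather than a finished solution: the dataset name syc_clean is hypothetical and is assumed to contain “psu” and the recoded variables, the macro name unimodel is made up for illustration, and everviol is an assumed name for the prior-violent-offense item, since the variable table does not show a name for it.

```sas
/* Aims 1-2: proportions for the categorical variables (listed in CLASS)
   and the mean age at first arrest. */
proc surveymeans data=syc_clean mean stderr clm;
  strata stratum;
  cluster psu;
  weight finalwt;
  class age_new violent_new livewith_new sex race_new ethnicty
        alcuse_new everdrug;
  var age_new violent_new livewith_new sex race_new ethnicty
      alcuse_new everdrug agefirst;
run;

/* Aim 3: domain analysis by violent_new; the violent_new = 1 rows
   of the domain output are the estimates of interest. */
proc surveymeans data=syc_clean mean stderr clm;
  strata stratum;
  cluster psu;
  weight finalwt;
  domain violent_new;
  var numarr probtn;
run;

/* Aim 4: Rao-Scott chi-square test of association.
   everviol is an ASSUMED variable name. */
proc surveyfreq data=syc_clean;
  strata stratum;
  cluster psu;
  weight finalwt;
  tables famtime*everviol / chisq;
run;

/* Aim 5: one-variable-at-a-time logistic models.
   var = predictor, ref = its reference level. */
%macro unimodel(var, ref);
  proc surveylogistic data=syc_clean;
    strata stratum;
    cluster psu;
    weight finalwt;
    class &var (ref="&ref") / param=ref;
    model violent_new (event='1') = &var;
  run;
%mend unimodel;

%unimodel(age_new, 3);       /* reference: 18+            */
%unimodel(sex, 2);           /* reference: female         */
%unimodel(race_new, 1);      /* reference: White          */
%unimodel(alcuse_new, 2);    /* reference: No             */
%unimodel(everdrug, 0);      /* reference: No             */
%unimodel(livewith_new, 1);  /* reference: Single parent  */
%unimodel(famtime, 2);       /* reference: No             */
```

The DOMAIN statement is used for aim 3 instead of subsetting the data with WHERE, so that the full sample design still enters the variance estimation. For the final model, the variables retained from the single-variable fits would be listed together in both the CLASS and MODEL statements of one PROC SURVEYLOGISTIC call.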