Storage Types, Display Format, String ↔ Numeric, (Dis)connect Characters
Jeehoon Han [email protected]
Fall 2017
Jeehoon Han [email protected] Storage Types Display Format String ↔ Numeric (Dis)connect Characters
Storage Types
- Storage types
  - Numbers (digits of accuracy)
    - Integers: `byte` (2), `int` (4), `long` (9)
    - Floating points: `float` (7), `double` (16)
  - Strings: `str1`, `str2`, ..., `str#`, where `str#` can hold strings of # characters or fewer
- The default storage type is `float`
- Storing a variable containing numbers with more than 7 digits
  - 8-9 digit integer: `gen long varname`
  - Otherwise: `gen double varname`
- Changing the storage type of an existing variable: `recast type varname`
- Use `compress` to save memory by storing variables in the smallest types that lose no precision
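A minimal sketch of why the storage type matters for long numbers. The variable names (`id_float`, `id_long`, `id_dbl`) and the values are made up for illustration, and the default type is assumed to be `float`.

```stata
clear
set obs 1
* A 9-digit integer has more digits than a float can hold accurately
gen id_float = 123456789          // default type: float
gen long id_long = 123456789      // long keeps all 9 digits
gen double id_dbl = 123456789.5   // non-integer with many digits: double
format id_float id_long %12.0f
format id_dbl %12.1f
list                              // id_float shows 123456792, not 123456789
* recast changes the type of an existing variable; widening to double
* does not bring back digits that were already rounded away
recast double id_float
* compress moves each variable to the smallest type that loses nothing
compress
describe
```

The rounding happens when the value is first stored, so the type has to be chosen in the `gen` command itself rather than fixed afterwards.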
Storage Types: Example
- `set obs 1`
  `gen var = 0.2`
  `tab var if var == 0.2`
  ⇒ no observations
- Problems
  - Numbers are stored in binary form, and most decimals have no exact binary representation (0.2 → 0.00110011...)
  - 0.2 is stored as 0.20000000298023224 in float and as 0.20000000000000001 in double
  - When you create the variable `var`, 0.2 is stored as a float, but Stata does all calculations in double precision, so the literal 0.2 in the comparison is a double
- Two ways to deal with this issue
  - Store the data as double
  - `tab var if var == float(0.2)`
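The comparison problem and both fixes can be checked directly. This is a small sketch that assumes the default storage type is `float`; `var_d` is a hypothetical name for the double copy.

```stata
clear
set obs 1
gen var = 0.2                  // stored as float
count if var == 0.2            // 0: float 0.2 is not equal to double 0.2
count if var == float(0.2)     // 1: the literal is rounded to float first
gen double var_d = 0.2         // alternative: store the data as double
count if var_d == 0.2          // 1: both sides are double
* Print both stored values with many decimals to see the difference
display %20.18f float(0.2)
display %20.18f 0.2
```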
Display Format
- Specify the display format: `format varlist %fmt`
- Numeric formats
  - Fixed format: `%w.df`
  - General format: `%w.dg`
  - where w is the total width of the display and d is the number of decimals (fixed format)
  - For the general format, Stata decides the number of decimals to display (if d > 0, d indicates the maximum number of decimal places)
- String format: `%ws`, where w is the width in characters
Display Format: Example
- Default formats
  - byte, int: `%8.0g`
  - long: `%12.0g`
  - float: `%9.0g`
  - double: `%10.0g`
- Examples
  `clear`
  `set obs 1`
  `gen double pi = 3.1415926535`
  `list pi` ⇒ 3.1415927
  `format pi %8.0g` ⇒ 3.14159
  `format pi %8.5f` ⇒ 3.14159
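One point worth checking by hand is that `format` changes only how a value is shown, not what is stored. The sketch below assumes a one-observation dataset; the string variable `name` is a hypothetical addition to show the `%ws` format.

```stata
clear
set obs 1
gen double pi = 3.1415926535
format pi %8.2f
list pi                          // displayed with two decimals
* The stored value is untouched, so an exact comparison still succeeds
count if pi == 3.1415926535
* String format: %10s shows the string in a field 10 characters wide
gen str5 name = "pi"
format name %10s
list
```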
Inspecting Data
- `sysuse uslifeexp, clear`
  `browse`
- `list`
Inspecting Data
- `describe`
- `codebook region`
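The inspection commands can be tried together on the `uslifeexp` data shipped with Stata. As an assumption for this sketch, `codebook` is applied to `le` and `list` is restricted to `year` and `le`, two variables in that dataset.

```stata
sysuse uslifeexp, clear
describe               // variable names, storage types, display formats, labels
codebook le            // type, range, and missing values for one variable
list year le in 1/5    // print the first five observations
```

Unlike `browse`, which opens the Data Editor window, `list` writes to the Results window and can be limited with `in` or `if`.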
Strings (pure text) ↔ Numerics
- String variable → numeric variable: `encode country, gen(country_code)`
- Numeric variable → string variable: `decode country_code, gen(country_str)`
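A small round trip shows what `encode` and `decode` do. The country values below are made up for illustration.

```stata
clear
input str10 country
"Korea"
"France"
"Korea"
"Japan"
end
* encode assigns integer codes in alphabetical order and attaches a
* value label, so the new variable still displays as text
encode country, gen(country_code)
list
list, nolabel               // show the underlying integer codes
label list country_code     // mapping: 1 France, 2 Japan, 3 Korea
* decode turns the labelled codes back into a string variable
decode country_code, gen(country_str)
assert country == country_str
```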
Strings (numeric text) ↔ Numerics
- String variable → numeric variable
  - `destring varlist, {gen(varname)|replace} [option]`
  - `[option]`
    - `ignore("chars")`: remove the specified nonnumeric characters
    - `force`: treat any values containing nonnumeric characters as missing values
Strings (numeric text) ↔ Numerics
- Example: `use http://www.stata-press.com/data/r13/destring2`
  - `destring price, gen(priceA) ignore("$ ,")`
    `destring price, gen(priceB) force`
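The difference between `ignore()` and `force` is easy to see without downloading anything. The sketch below uses made-up price strings that contain only a thousands separator, not the `destring2` data.

```stata
clear
input str10 price
"1,250.50"
"980"
"12,000"
end
* ignore(): strip the listed characters, then convert every value
destring price, gen(priceA) ignore(",")
* force: any value with a nonnumeric character becomes missing
destring price, gen(priceB) force
list
* priceA is 1250.5, 980, 12000; priceB is ., 980, .
```

`force` silently discards information, so it is usually worth trying `ignore()` first and checking which values would otherwise turn into missing.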
Strings (numeric text) ↔ Numerics
- Numeric variable → string variable
  - `tostring varlist, {gen(varname)|replace} [option]`
  - `[option]`
    - `format(%fmt)`: convert using the specified format
    - `force`: convert to string even if it entails information loss
  - `tostring priceA, gen(price_strA)`
    `tostring priceA, gen(price_strB) format(%8.1f) force`
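A short sketch of the two `tostring` calls on made-up values, showing where information is lost. The numbers are hypothetical and stand in for `priceA`.

```stata
clear
set obs 2
gen double priceA = 2343.68 in 1
replace priceA = 7233.4 in 2
* Default conversion keeps every digit, so it is reversible
tostring priceA, gen(price_strA)
assert real(price_strA) == priceA
* A one-decimal format rounds 2343.68, so force is required
tostring priceA, gen(price_strB) format(%8.1f) force
list
```

Without `force`, the second call would refuse to create `price_strB` because the conversion cannot be reversed.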
Disconnect the Characters of Variables
- `substr(str, n, m)`: extract a substring of `str` starting at position n for a length of m; a negative n counts from the end of the string, and m = `.` extracts through the end of the string
  - `gen year = substr(date,1,4)`
    `gen month = substr(date,6,2)`
    `gen day = substr(date,-2,.)`
  - `gen length = strlen(price_strA)`
    `gen decimal = substr(price_strA,-2,.)`
    `gen integer = substr(price_strA,1,length-3)`
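The substring steps can be run on a tiny dataset. The values are made up, and the sketch assumes the price is held as a string with exactly two decimals in `price_strA`; `len`, `dec_part`, and `int_part` are hypothetical variable names.

```stata
clear
input str10 date str8 price_strA
"1999 12 10" "2343.68"
"2000 07 08" "7233.44"
end
gen year  = substr(date, 1, 4)      // characters 1-4
gen month = substr(date, 6, 2)      // characters 6-7
gen day   = substr(date, -2, .)     // last two characters
* Split the price string around the decimal point
gen len      = strlen(price_strA)
gen dec_part = substr(price_strA, -2, .)
gen int_part = substr(price_strA, 1, len - 3)
list
```

Every result of `substr()` is a string, so `destring` or `real()` is still needed before doing arithmetic on the pieces.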
Connect the Characters of Variables
- `gen date1 = year+" "+month+" "+day`
  `egen date2 = concat(year month day), punct(" ")`
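Both ways of joining strings give the same result when all the pieces are strings. The sketch uses made-up values; `yr_num` and `date3` are hypothetical names added to show that `concat()` also accepts numeric variables.

```stata
clear
input str4 year str2 month str2 day
"1999" "12" "10"
"2000" "07" "08"
end
gen date1 = year + " " + month + " " + day
egen date2 = concat(year month day), punct(" ")
assert date1 == date2
* The + operator needs strings on both sides; concat() converts
* numeric variables to strings itself
gen yr_num = real(year)
egen date3 = concat(yr_num month day), punct("-")
list
```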
Sort
- Arrange the data in ascending order
  - `sysuse uslifeexp, clear`
  - `sort le`
    `sort year le`
  - `gsort -year` (a minus sign sorts in descending order)
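A quick way to see the effect of each command is to list the first few rows after sorting. This sketch uses the `uslifeexp` data shipped with Stata.

```stata
sysuse uslifeexp, clear
sort le                   // ascending life expectancy
list year le in 1/3
sort year le              // by year, ties broken by le
gsort -year               // descending year
list year le in 1/3
* gsort can mix directions: highest le first, then ascending year
gsort -le year
list year le in 1/3
```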
Sort: Applications
- Create a lagged variable
  - `sort year`
    `gen le_lag = le[_n-1]`
- Finding duplicates
  - `sort year`
    `list if year == year[_n-1]`
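Both applications rely on `_n`, the number of the current observation, so the data must be sorted first. The sketch below uses `uslifeexp`; `le_change` is a hypothetical variable, and `duplicates report` is shown as an alternative built-in check.

```stata
sysuse uslifeexp, clear
sort year
* le[_n-1] is the value of le in the previous row; it is missing
* for the first observation, which has no previous row
gen le_lag = le[_n-1]
gen le_change = le - le_lag
list year le le_lag le_change in 1/5
* After sorting, a duplicated year sits directly below its twin
count if year == year[_n-1]
duplicates report year
```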